LAScatalog processing engine
This function gives users access to the LAScatalog processing engine.
It allows the application of a user-defined routine over an entire catalog. The LAScatalog
processing engine tool is explained in the LAScatalog classcatalog_apply()
is the core of the lidR package. It drives every single function that can process a
LAScatalog
. It is flexible and powerful but also complex. catalog_sapply
is the same
with the option automerge = TRUE
enforced to simplify the output.
Warning: the LAScatalog processing engine has a mechanism to load buffered data 'on-the-fly'
to avoid edge artifacts, but no mechanism to remove the buffer after applying user-defined functions,
since this task is specific to each process. In other lidR
functions this task is performed
specifically for each function. In catalog_apply()
the user's function can return any output,
thus users must take care of this task themselves (See section "Edge artifacts")
catalog_apply(ctg, FUN, ..., .options = NULL) catalog_sapply(ctg, FUN, ..., .options = NULL)
ctg |
A LAScatalog object. |
FUN |
A user-defined function that respects a given template (see section function template) |
... |
Optional arguments to FUN. |
.options |
See dedicated section and examples. |
It is important to take precautions to avoid 'edge artifacts' when processing wall-to-wall tiles. If the points from neighboring tiles are not included during certain processes, this could create 'edge artifacts' at the tile boundaries. For example, empty or incomplete pixels in a rasterization process, or dummy elevations in a ground interpolation. The LAScatalog processing engine provides internal tools to load buffered data 'on-the-fly'. However, there is no mechanism to remove the results computed in the buffered area since this task depends on the output of the user-defined function. The user must take care of this task (see examples) to prevent unexpected output with duplicated entries or conflict between values computed twice.
The LAS objects read by the user function have a special attribute called 'buffer' that indicates, for each point, if it comes from a buffered area or not. Points from non-buffered areas have a 'buffer' value of 0, while points from buffered areas have a 'buffer' value of 1, 2, 3 or 4, where 1 is the bottom buffer and 2, 3 and 4 are the left, top and right buffers, respectively. This allows for filtering of buffer points if required.
The parameter FUN
expects a function with a first argument that will be supplied automatically
by the LAScatalog
processing engine. This first argument is a LAScluster
. A LAScluster
is an internal undocumented class but the user needs to know only three things about this class:
A user-defined function must be templated like this:
myfun <- function(cluster, ...) { las <- readLAS(cluster) if (is.empty(las)) return(NULL) # do something # remove the buffer of the output return(something) }
The line if(is.empty(las)) return(NULL)
is important because some clusters (chunks) may contain
0 points (we can't know this before reading the file). In this case an empty point cloud with 0 points
is returned by readLAS()
and this may fail in subsequent code. Thus, exiting early from the user-defined
function by returning NULL
indicates to the internal engine that the cluster was empty.
From v3.0.0 if autoread = TRUE
the following template is accepted because the engine takes care
of the above mentionned steps:
myfun <- function(las, bbox ...) { # do something }
Users may have noticed that some lidR functions throw an error when the processing options are
inappropriate. For example, some functions need a buffer and thus buffer = 0
is forbidden.
Users can add the same constraints to protect against inappropriate options. The .options
argument is a list
that allows users to tune the behavior of the processing engine.
drop_null = FALSE
Not intended to be used by regular users. The engine does not remove
NULL outputs
need_buffer = TRUE
the function complains if the buffer is 0.
need_output_file = TRUE
the function complains if no output file template is provided.
raster_alignment = ...
the function checks the alignment of the chunks. This option is
important if the output is a raster. See below for more details.
automerge = TRUE
by defaut the engine returns a list`` with one item per chunk. If
automerge = TRUE, it tries to merge the outputs into a single object: a
Raster*“, a
Spatial*
, a 'LAS*' similar to other functions of the package. This is a fail-safe
option so in the worst case, if the merge fails, the 'list' is returned.
autoread = TRUE
. Introduced in v3.0.0 this option enables to get rid of the first steps of the
function i.e readLAS()
and if (is.empty())
. In this case the function must take two
objects as input, first a LAS
object and second a Extent
from raster
.
When the function FUN
returns a raster it is important to ensure that the chunks are aligned
with the raster to avoid edge artifacts. Indeed, if the edge of a chunk does not correspond to the edge
of the pixels, the output will not be strictly continuous and will have edge artifacts (that might
not be visible). Users can check this with the options raster_alignment
, that can take the
resolution of the raster as input, as well as the starting point if needed. The following are accepted:
# check if chunks are aligned with a raster of resolution 20 raster_alignment = 20 raster_alignment = list(res = 20) # check if chunks are aligned with a raster of resolution 20 # that starts at (0,10) raster_alignment = list(res = 20, start = c(0,10)) ``` See also \link{grid_metrics} for more details.
Supported processing options for a LAScatalog
(in bold). For more details see the
LAScatalog engine documentation:
chunk_size: How much data is loaded at once.
chunk_buffer: Load chunks with a buffer.
chunk_alignment: Align the chunks.
progress: Displays a progress estimate.
output_files: The user-defined function outputs will be written to files instead of being returned into R.
laz_compression: write las
or laz
files only if the user-defined function
returns a 'LAS“ object.
select: Select only the data of interest to save processing memory.
filter: Read only the points of interest.
# More examples might be avaible in the official lidR vignettes or # on the github book <https://jean-romain.github.io/lidRbook/> ## ========================================================================= ## Example 1: detect all the tree tops over an entire catalog ## (this is basically a reproduction of the existing lidR function 'tree_detection') ## ========================================================================= # 1. Build the user-defined function that analyzes each chunk of the catalog. # The function's first argument is a LAScluster object. The other arguments can be freely # choosen by the user. my_tree_detection_method <- function(cluster, ws) { # The cluster argument is a LAScluster object. The user does not need to know how it works. # readLAS will load the region of interest (chunk) with a buffer around it, taking advantage of # point cloud indexation if possible. The filter and select options are propagated automatically las <- readLAS(cluster) if (is.empty(las)) return(NULL) # Find the tree tops using a user-developed method (here simply a LMF). ttops <- find_trees(las, lmf(ws)) # ttops is a SpatialPointsDataFrame that contains the tree tops in our region of interest # plus the trees tops in the buffered area. We need to remove the buffer otherwise we will get # some trees more than once. bbox <- raster::extent(cluster) ttops <- raster::crop(ttops, bbox) return(ttops) } # 2. Build a project (here, a single file catalog for the purposes of this dummmy example). LASfile <- system.file("extdata", "MixedConifer.laz", package="lidR") prj <- readLAScatalog(LASfile) plot(prj) # 3. Set some processing options. # For this dummy example, the chunk size is 100 m and the buffer is 10 m opt_chunk_buffer(prj) <- 10 opt_chunk_size(prj) <- 100 # small because this is a dummy example. opt_chunk_alignment(prj) <- c(-50, -35) # Align such as it creates 2 chunks only. opt_select(prj) <- "xyz" # Read only the coordinates. opt_filter(prj) <- "-keep_first" # Read only first returns. # 4. Apply a user-defined function to take advantage of the internal engine opt <- list(need_buffer = TRUE, # catalog_apply will throw an error if buffer = 0 automerge = TRUE) # catalog_apply will merge the outputs into a single object output <- catalog_apply(prj, my_tree_detection_method, ws = 5, .options = opt) spplot(output) ## Not run: ## =================================================== ## Example 2: compute a rumple index on surface points ## =================================================== rumple_index_surface = function(cluster, res) { las = readLAS(cluster) if (is.empty(las)) return(NULL) las <- filter_surfacepoints(las, 1) rumple <- grid_metrics(las, ~rumple_index(X,Y,Z), res) bbox <- raster::extent(cluster) rumple <- raster::crop(rumple, bbox) return(rumple) } LASfile <- system.file("extdata", "Megaplot.laz", package="lidR") prj <- readLAScatalog(LASfile) opt_chunk_buffer(prj) <- 1 opt_chunk_size(prj) <- 140 # small because this is a dummy example. opt_select(prj) <- "xyz" # read only the coordinates. opt <- list(raster_alignment = 20, # catalog_apply will adjust the chunks if required automerge = TRUE) # catalog_apply will merge the outputs into a single raster output <- catalog_apply(prj, rumple_index_surface, res = 20, .options = opt) plot(output, col = height.colors(50)) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.