Apply a function to a stream of RecordBatches
As an alternative to calling collect() on a Dataset query, you can
use this function to access the stream of RecordBatches in the Dataset.
This lets you aggregate on each chunk and pull the intermediate results into
a data.frame for further aggregation, even if you couldn't fit the whole
Dataset result in memory.
map_batches(X, FUN, ..., .data.frame = TRUE)
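X | A Dataset, or a dplyr query on a Dataset
FUN | A function or purrr-style lambda expression to apply to each batch
... | Additional arguments passed to FUN
.data.frame | logical: collect the resulting chunks into a single data.frame? Default TRUE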
This is experimental and not recommended for production use.
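The chunk-wise aggregation described above can be sketched as follows. This is a minimal illustration, not a recommended pattern: it assumes the arrow and dplyr packages are installed, uses the built-in mtcars data written to a temporary directory, and introduces a hypothetical helper, summarize_batch, that is not part of the package. The result is passed through as.data.frame() on the assumption that this gives a data.frame regardless of what map_batches() returns in the installed version.

```r
library(arrow)
library(dplyr)

# Write a small partitioned dataset so the scan yields more than one batch.
dataset_dir <- tempfile()
write_dataset(mtcars, dataset_dir, partitioning = "cyl")
ds <- open_dataset(dataset_dir)

# Hypothetical helper (not part of arrow): reduce one RecordBatch to
# per-gear partial counts and sums, which are small enough to keep in memory.
summarize_batch <- function(batch) {
  batch %>%
    as.data.frame() %>%
    group_by(gear) %>%
    summarize(n = n(), sum_mpg = sum(mpg))
}

# Apply the helper to every batch and gather the partial results.
partials <- as.data.frame(map_batches(ds, summarize_batch))

# Combine the partial results into the final per-gear mean.
result <- partials %>%
  group_by(gear) %>%
  summarize(mean_mpg = sum(sum_mpg) / sum(n), n = sum(n))
```

Carrying sums and counts rather than per-batch means keeps the second aggregation exact, since a mean of means would weight every batch equally regardless of its size.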