Batchwise Backfitting
Batchwise backfitting estimation engine for GAMLSS using very large data sets.
## Batchwise backfitting engine.
opt_bbfit(x, y, family, shuffle = TRUE, start = NULL, offset = NULL,
  epochs = 1, nbatch = 10, verbose = TRUE, ...)
bbfit(x, y, family, shuffle = TRUE, start = NULL, offset = NULL,
  epochs = 1, nbatch = 10, verbose = TRUE, ...)

## Parallel version.
opt_bbfitp(x, y, family, mc.cores = 1, ...)

## Loglik contribution plot.
contribplot(x, ...)
x |
For function |
y |
The model response, as returned from function |
family |
A bamlss family object, see |
shuffle |
Should observations be shuffled? |
start |
A named numeric vector containing possible starting values, the names are based on
function |
offset |
Can be used to supply model offsets for use in fitting,
returned from function |
epochs |
For how many epochs should the algorithm run? |
nbatch |
Number of batches. Can also be a number between 0 and 1, i.e., determining the fraction of observations that should be used for fitting. |
verbose |
Print information during runtime of the algorithm. |
mc.cores |
On how many cores should estimation be started? |
... |
For |
The algorithm estimates the model batch-wise; the smoothing variances are estimated on a hold-out batch. This way, models for very large data sets can be estimated. Note that the algorithm only works in combination with the ff and ffbase packages. The data must be stored as a comma-separated file on disc; see the example.
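To make the batch idea concrete, the following base R sketch splits row indices into batches and pairs each batch with a hold-out batch. It is an illustration only and not the code used by opt_bbfit(): the helper make_batches() is hypothetical, and using the next batch as the hold-out (wrapping around at the end) is an assumption made here for the example.

```r
## Hypothetical helper, not part of bamlss: split row indices into batches.
make_batches <- function(n, nbatch = 10, shuffle = TRUE) {
  ## Optionally shuffle the observations before batching.
  idx <- if (shuffle) sample.int(n) else seq_len(n)
  ## Assign positions 1..n to nbatch groups of (nearly) equal size.
  grp <- ceiling(seq_len(n) * nbatch / n)
  unname(split(idx, grp))
}

set.seed(1)
batches <- make_batches(n = 27000, nbatch = 10)

## Assumption for illustration: the batch following the current one
## serves as the hold-out batch, wrapping around after the last batch.
for (i in seq_along(batches)) {
  fit_idx  <- batches[[i]]
  hold_idx <- batches[[if (i < length(batches)) i + 1 else 1]]
  ## ... update coefficients on fit_idx,
  ## ... choose smoothing variances using hold_idx ...
}
```

Each observation falls into exactly one batch, so one pass over all batches corresponds to one epoch in this sketch.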
For function bbfit()
a list containing the following objects:
fitted.values |
A named list of the fitted values of the modeled parameters of the selected distribution. |
parameters |
The estimated set of regression coefficients and smoothing variances. |
shuffle |
Logical |
runtime |
The runtime of the algorithm. |
## Not run: 
## Simulate data.
set.seed(123)
d <- GAMart(n = 27000, sd = -1)

## Write data to disc.
tf <- tempdir()
write.table(d, file.path(tf, "d.raw"), quote = FALSE,
  row.names = FALSE, sep = ",")

## Estimation using batch-wise backfitting.
f <- list(
  num ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10),
  sigma ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10)
)

b <- bamlss(f, data = file.path(tf, "d.raw"), optimizer = opt_bbfit,
  sampler = FALSE, nbatch = 10, epochs = 2, loglik = TRUE)

## Show estimated effects.
plot(b)

## End(Not run)