A General Framework For Bagging
bag provides a framework for bagging classification or regression models. The user can provide their own functions for model building, prediction and aggregation of predictions (see Details below).
bag(x, ...)
bagControl(
fit = NULL,
predict = NULL,
aggregate = NULL,
downSample = FALSE,
oob = TRUE,
allowParallel = TRUE
)
## Default S3 method:
bag(x, y, B = 10, vars = ncol(x), bagControl = NULL, ...)
## S3 method for class 'bag'
predict(object, newdata = NULL, ...)
## S3 method for class 'bag'
print(x, ...)
## S3 method for class 'bag'
summary(object, ...)
## S3 method for class 'summary.bag'
print(x, digits = max(3, getOption("digits") - 3), ...)
ldaBag
plsBag
nbBag
ctreeBag
svmBag
nnetBag

x: a matrix or data frame of predictors

...: arguments to pass to the model function

fit: a function that has arguments x, y and ... and produces a model object that can later be used for prediction (a sketch of all three user-supplied functions follows this argument list)

predict: a function that generates predictions for each sub-model. The function should have arguments object and x. The output of the function can be any type of object (see the example below where posterior probabilities are generated)

aggregate: a function with arguments x and type that takes the list of predictions from the sub-models and reduces them to a single prediction per sample. For classification models, the type argument can be used to switch between predicting classes or class probabilities

downSample: logical: for classification, should the data set be randomly sampled so that each class has the same number of samples as the smallest class?

oob: logical: should out-of-bag statistics be computed and the predictions retained?

allowParallel: if a parallel backend is loaded and available, should the function use it?

y: a vector of outcomes

B: the number of bootstrap samples to train over

vars: an integer. If this argument is not NULL, a random sample of size vars of the predictors is taken in each bagging iteration. If NULL, all of the predictors are used

bagControl: a list of options

object: an object of class bag

newdata: a matrix or data frame of samples for prediction. Note that this argument must have a non-null value

digits: minimal number of significant digits
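To make the fit, predict and aggregate roles concrete, the following is a minimal sketch of the three user-supplied functions for a regression bag built on lm(). The lm() model and the names lmFit, lmPred and lmAggregate are illustrative assumptions, not part of caret.

library(caret)

## hypothetical fit function: receives the (possibly subsetted) predictors and the outcome
lmFit <- function(x, y, ...) {
  dat <- as.data.frame(x)
  dat$.outcome <- y
  lm(.outcome ~ ., data = dat, ...)
}

## hypothetical predict function: receives one fitted sub-model and new predictors
lmPred <- function(object, x) {
  predict(object, newdata = as.data.frame(x))
}

## hypothetical aggregate function: receives a list with one prediction vector per
## sub-model; for regression, simply average the B predictions for each sample
lmAggregate <- function(x, type = "class") {
  rowMeans(do.call("cbind", x))
}

## ctrl <- bagControl(fit = lmFit, predict = lmPred, aggregate = lmAggregate)
## lmBagged <- bag(mtcars[, -1], mtcars$mpg, B = 10, bagControl = ctrl)
## predict(lmBagged, mtcars[1:3, -1])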
ldaBag, plsBag, nbBag, ctreeBag, svmBag and nnetBag are each an object of class list of length 3.
The function is basically a framework into which users can plug any model to assess the effect of bagging. Example functions can be found in ldaBag, plsBag, nbBag, svmBag and nnetBag.
Each has elements fit, pred and aggregate.
One note: when vars is not NULL, the subsetting occurs prior to the fit and predict functions being called. In this way, the user probably does not need to account for the change in predictors in their functions.
When using bag with train, classification models should use type = "prob" inside of the predict function so that predict.train(object, newdata, type = "prob") will work.
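For example, a pred function can return a data frame of class probabilities and the aggregate function can average them, using the type argument to switch between classes and probabilities. The sketch below loosely follows the pattern of ldaBag and would pair with an lda-style fit function; the names ldaProbPred and ldaProbAgg are assumptions for illustration.

library(MASS)

## hypothetical pred function: one row of posterior probabilities per sample
ldaProbPred <- function(object, x) {
  as.data.frame(predict(object, as.data.frame(x))$posterior)
}

## hypothetical aggregate function: average the probabilities across sub-models
ldaProbAgg <- function(x, type = "class") {
  pooled <- Reduce(`+`, x) / length(x)
  if (type == "class") {
    ## report the most probable class for each sample
    factor(colnames(pooled)[apply(pooled, 1, which.max)],
           levels = colnames(pooled))
  } else {
    as.data.frame(pooled)
  }
}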
If a parallel backend is registered, the foreach package is used to train the models in parallel.
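For instance, a doParallel cluster (one of several foreach-compatible backends; the choice here is an assumption) can be registered before the call:

library(doParallel)

cl <- makeCluster(2)      # create a small worker cluster
registerDoParallel(cl)    # register it so foreach (and hence bag) can use it

## ... call bag() or train(..., method = "bag") here ...

stopCluster(cl)           # release the workers when finished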
bag produces an object of class bag with elements
fits: a list with two sub-objects for each bagging iteration: the fitted model and, when vars is not NULL, the indices of the predictors sampled for that model

control: a mirror of the arguments passed into bagControl

call: the call

B: the number of bagging iterations

dims: the dimensions of the training set
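As a quick illustration (assuming the treebag model from the examples below has been fit), these elements can be inspected directly:

## treebag$B        # number of bagging iterations
## treebag$dims     # dimensions of the training set
## treebag$control  # mirror of the bagControl() settings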
Max Kuhn
## A simple example of bagging conditional inference regression trees:
data(BloodBrain)
## treebag <- bag(bbbDescr, logBBB, B = 10,
##                bagControl = bagControl(fit = ctreeBag$fit,
##                                        predict = ctreeBag$pred,
##                                        aggregate = ctreeBag$aggregate))

## An example of pooling posterior probabilities to generate class predictions
data(mdrr)

## remove some zero variance predictors and linear dependencies
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .95)]

## basicLDA <- train(mdrrDescr, mdrrClass, "lda")

## bagLDA2 <- train(mdrrDescr, mdrrClass,
##                  "bag",
##                  B = 10,
##                  bagControl = bagControl(fit = ldaBag$fit,
##                                          predict = ldaBag$pred,
##                                          aggregate = ldaBag$aggregate),
##                  tuneGrid = data.frame(vars = c((1:10)*10 , ncol(mdrrDescr))))
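If the commented-out models above are actually fit, predictions follow the predict methods documented earlier (a usage sketch; the row selection is arbitrary):

## predict(treebag, newdata = bbbDescr[1:5, ])
## predict(bagLDA2, newdata = mdrrDescr[1:5, ], type = "prob")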