Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

rfsrc.fast

Fast Random Forests


Description

Fast approximate random forests using subsampling with forest options set to encourage computational speed. Applies to all families.

Usage

rfsrc.fast(formula, data,
  ntree = 500,
  nsplit = 10,
  bootstrap = "by.root",
  ensemble = "oob",
  sampsize = function(x){min(x * .632, max(150, x ^ (3/4)))},
  samptype = "swor",
  samp = NULL,
  ntime = 50,
  forest = FALSE,
  ...)

Arguments

formula

A symbolic description of the model to be fit. If missing, unsupervised splitting is implemented.

data

Data frame containing the y-outcome and x-variables.

ntree

Number of trees.

nsplit

Non-negative integer value specifying number of random split points used to split a node (deterministic splitting corresponds to the value zero and is much slower).

bootstrap

Bootstrap protocol used in growing a tree.

ensemble

Specifies the type of ensemble. We request only out-of-sample which corresponds to "oob".

sampsize

Function specifying size of subsampled data. Can also be a number.

samptype

Type of bootstrap used.

samp

Bootstrap specification when "by.user" is used.

ntime

Integer value used for survival to constrain ensemble calculations to a grid of ntime time points.

forest

Should the forest object be returned? Turn this on if you want prediction on test data but for big data this can be large.

...

Further arguments to be passed to rfsrc.

Details

Calls rfsrc under various options (including subsampling) to encourage computational speeds. This will provide a good approximation but will not be as good as default settings of rfsrc.

Value

An object of class (rfsrc, grow).

Author(s)

Hemant Ishwaran and Udaya B. Kogalur

See Also

Examples

## ------------------------------------------------------------
## Iowa housing regression example
## ------------------------------------------------------------

## load the Iowa housing data
data(housing, package = "randomForestSRC")

## do quick and *dirty* imputation
housing <- impute(SalePrice ~ ., housing,
         ntree = 50, nimpute = 1, splitrule = "random")

## grow a fast forest
o1 <- rfsrc.fast(SalePrice ~ ., housing)
o2 <- rfsrc.fast(SalePrice ~ ., housing, nodesize = 1)
print(o1)
print(o2)

## grow a fast bivariate forest
o3 <- rfsrc.fast(cbind(SalePrice,Overall.Qual) ~ ., housing)
print(o3)

## ------------------------------------------------------------
## White wine classification example
## ------------------------------------------------------------

data(wine, package = "randomForestSRC")
wine$quality <- factor(wine$quality)
o <- rfsrc.fast(quality ~ ., wine)
print(o)


## ------------------------------------------------------------
## pbc survival example
## ------------------------------------------------------------

data(pbc, package = "randomForestSRC")
o <- rfsrc.fast(Surv(days, status) ~ ., pbc)
print(o)

## ------------------------------------------------------------
## WIHS competing risk example
## ------------------------------------------------------------

data(wihs, package = "randomForestSRC")
o <- rfsrc.fast(Surv(time, status) ~ ., wihs)
print(o)

randomForestSRC

Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)

v2.11.0
GPL (>= 3)
Authors
Hemant Ishwaran <hemant.ishwaran@gmail.com>, Udaya B. Kogalur <ubk@kogalur.com>
Initial release
2021-03-30

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.