rfPermute

Estimate Permutation p-values for Random Forest Importance Metrics


Description

Estimate the significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of importance metrics for each predictor variable and a p-value for each observed value.

Usage

rfPermute(x, ...)

## Default S3 method:
rfPermute(x, y, ..., nrep = 100, num.cores = 1)

## S3 method for class 'formula'
rfPermute(formula, data = NULL, ..., subset, na.action = na.fail, nrep = 100)

Arguments

x, y, formula, data, subset, na.action, ...

See randomForest for definitions.

nrep

Number of permutation replicates to run to construct the null distribution and calculate p-values (default = 100).

num.cores

Number of CPUs to distribute permutation replicates over. The usage above shows a default of 1; NULL uses one fewer than the number of cores reported by detectCores.
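To see what a NULL value of num.cores would resolve to on a given machine, the core count can be checked ahead of time with the parallel package. The following is a minimal sketch; resolve_cores is a hypothetical helper written for illustration and is not part of rfPermute, and the fallback to a single core is an assumption.

```r
# Hypothetical helper, not part of rfPermute: mirrors the documented
# behaviour of num.cores = NULL (one fewer than the detected cores).
resolve_cores <- function(num.cores = NULL) {
  if (is.null(num.cores)) {
    detected <- parallel::detectCores()
    # detectCores() can return NA on some platforms; assume one core then
    if (is.na(detected)) return(1L)
    # Never go below one core, even on a single-core machine
    return(max(1L, detected - 1L))
  }
  # An explicit request is passed through unchanged
  as.integer(num.cores)
}

resolve_cores()   # uses the detected core count
resolve_cores(2)  # uses the explicit value
```

The result could then be supplied as num.cores in a call to rfPermute.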

Details

All other parameters are as defined in randomForest.formula. A Random Forest model is first created as normal to calculate the observed values of variable importance. The response variable is then permuted nrep times, with a new Random Forest model built for each permutation step.
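The permutation procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: perm_pvalues and abs_cor are hypothetical helpers, absolute correlation stands in for Random Forest importance so that the sketch runs without extra packages, and the (count + 1) / (nrep + 1) estimator is an assumed, commonly used form of the permutation p-value.

```r
# Hypothetical helper: permutation p-values for any importance function.
# importance_fn(x, y) must return a named numeric vector with one value
# per predictor.
perm_pvalues <- function(x, y, importance_fn, nrep = 100) {
  # Observed importance from the unpermuted response
  obs <- importance_fn(x, y)
  # Null distribution: one column per permutation of the response
  null <- matrix(
    replicate(nrep, importance_fn(x, sample(y))),
    nrow = length(obs),
    dimnames = list(names(obs), NULL)
  )
  # Share of null values at least as large as the observed value,
  # counting the observed value as one draw (assumed estimator)
  pval <- (rowSums(null >= obs) + 1) / (nrep + 1)
  list(observed = obs, null.dist = null, pval = pval)
}

# Stand-in importance metric (an assumption, NOT Random Forest importance):
# absolute Pearson correlation between each predictor and the response.
# With the randomForest package installed, a function that fits a forest
# and returns one column of importance() could be passed instead.
abs_cor <- function(x, y) abs(cor(x, y)[, 1])

set.seed(1)
aq <- na.omit(airquality)
res <- perm_pvalues(aq[, -1], aq$Ozone, abs_cor, nrep = 50)
res$pval
```

Because the response is shuffled while the predictors are left intact, each replicate breaks any association between the two, which is what makes the resulting values usable as a null distribution.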

Value

An rfPermute object which contains all of the components of a randomForest object plus:

null.dist

A list containing two three-dimensional arrays of null distributions for unscaled and scaled importance measures.

pval

A three-dimensional array containing permutation p-values for unscaled and scaled importance measures.

See Also

plotNull for plotting null distributions from the rfPermute objects.
rp.importance for extracting importance measures.
rp.combine for combining multiple rfPermute objects.
proximityPlot for plotting case proximities.
impHeatmap for plotting a heatmap of importance scores.
randomForest

Examples

library(rfPermute)

# A regression model using the ozone example
data(airquality)
ozone.rfP <- rfPermute(
  Ozone ~ ., data = airquality, ntree = 100, 
  na.action = na.omit, nrep = 50, num.cores = 1
)
  
# Plot the null distributions and observed values.
plotNull(ozone.rfP) 
  
# Plot the unscaled importance distributions and highlight significant predictors
plot(rp.importance(ozone.rfP, scale = FALSE))
  
# ... and the scaled measures
plot(rp.importance(ozone.rfP, scale = TRUE))

v2.1.81
GPL (>= 2)
Authors
Eric Archer [aut, cre]
Initial release