vip: vi_firm – R documentation


vi_firm

Variance-based variable importance

Description

Compute variance-based variable importance using a simple feature importance ranking measure (FIRM) approach; for details, see Greenwell et al. (2018) and Scholbeck et al. (2019).

Usage

vi_firm(object, ...)

## Default S3 method:
vi_firm(object, feature_names, FUN = NULL, var_fun = NULL, ice = FALSE, ...)

Arguments

`object`	A fitted model object (e.g., a `"randomForest"` object).
`...`	Additional optional arguments to be passed on to `partial`.
`feature_names`	Character vector giving the names of the predictor variables (i.e., features) of interest.
`FUN`	Deprecated. Use `var_fun` instead.
`var_fun`	List with two components, `"cat"` and `"con"`, containing the functions used to quantify the variability of the feature effects (e.g., partial dependence values) for categorical and continuous features, respectively. If `NULL`, the standard deviation is used for continuous features and the range divided by four (i.e., `(max - min) / 4`) is used for categorical features. Only applies when `method = "firm"`.
`ice`	Logical indicating whether or not to estimate feature effects using individual conditional expectation (ICE) curves. Only applies when `method = "firm"`. Default is `FALSE`. Setting `ice = TRUE` is preferred whenever strong interaction effects are potentially present.
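
As a minimal sketch of what a `var_fun` list might look like, the base R snippet below builds one that mirrors the documented defaults (standard deviation for continuous features, range divided by four for categorical ones). The helper name `range_over_4` and the effect values are made up for illustration and are not part of the package.

```r
# Hypothetical helper mirroring the documented categorical default.
range_over_4 <- function(x) diff(range(x)) / 4  # (max - min) / 4

# A list with the two components described above.
var_fun <- list(
  "cat" = range_over_4,  # applied to categorical feature effects
  "con" = stats::sd      # applied to continuous feature effects
)

# Made-up feature-effect values, for illustration only.
effect_con <- c(10, 12, 14, 16, 18)
effect_cat <- c(a = 3, b = 7, c = 5)

con_score <- var_fun$con(effect_con)  # standard deviation of the effect
cat_score <- var_fun$cat(effect_cat)  # (7 - 3) / 4
```

A list of this shape could then be supplied as `var_fun = var_fun`; swapping in another spread measure (for example `stats::mad`) for either component would follow the same pattern.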

Details

This approach to computing VI scores is based on quantifying the relative "flatness" of the effect of each feature. Feature effects can be assessed using partial dependence plots (PDPs) or individual conditional expectation (ICE) curves. These approaches are model-agnostic and can be applied to any supervised learning algorithm. By default, relative "flatness" is measured by the standard deviation of the y-axis values of the feature effect plot for numeric features, and by the range divided by 4 for categorical features. These defaults can be changed via the `var_fun` argument. See Greenwell et al. (2018) for details and additional examples.
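
The idea can be sketched in base R without the package. The code below is an illustration under stated assumptions, not the package's implementation: the helpers `ice_curves` and `firm_score`, the 20-point equally spaced grid, and the choice to average the per-curve standard deviations in the ICE case are all hypothetical choices made for this example.

```r
# Fit a simple model on a built-in data set (illustrative choice).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: ICE curves for one numeric feature.
# Returns a matrix with one row per observation and one column per grid value.
ice_curves <- function(model, data, feature, grid_size = 20) {
  grid <- seq(min(data[[feature]]), max(data[[feature]]),
              length.out = grid_size)
  sapply(grid, function(value) {
    modified <- data
    modified[[feature]] <- value  # force the feature to a single value
    predict(model, newdata = modified)
  })
}

# Hypothetical helper: FIRM-style score for one numeric feature.
firm_score <- function(model, data, feature, ice = FALSE) {
  curves <- ice_curves(model, data, feature)
  if (ice) {
    mean(apply(curves, 1, sd))  # average spread of the individual ICE curves
  } else {
    sd(colMeans(curves))        # spread of the partial dependence curve
  }
}

# Collect the scores in a data frame shaped like the documented return value.
features <- c("wt", "hp")
scores <- data.frame(
  Variable = features,
  Importance = vapply(features, function(f) firm_score(fit, mtcars, f),
                      numeric(1), USE.NAMES = FALSE)
)
scores <- scores[order(-scores$Importance), ]
```

A flat effect curve yields a score near zero, while a steep or highly variable curve yields a large one. For a purely additive model such as this one, the ICE curves are parallel shifts of the partial dependence curve, so both variants give the same score; they would differ only when interactions are present.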

Value

A tidy data frame (i.e., a "tibble" object) with two columns, Variable and Importance, containing the variable name and its associated importance score, respectively.

References

Greenwell, B. M., Boehmke, B. C., and McCarthy, A. J. A Simple and Effective Model-Based Variable Importance Measure. arXiv preprint arXiv:1805.04755 (2018).

Scholbeck, C. A., Molnar, C., Heumann, C., Bischl, B., and Casalicchio, G. Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model-Agnostic Interpretations. arXiv preprint arXiv:1904.03959 (2019).

vip

Variable Importance Plots

v0.3.2

GPL (>= 2)

Authors

Brandon Greenwell [aut, cre] (<https://orcid.org/0000-0002-8120-0084>), Brad Boehmke [aut] (<https://orcid.org/0000-0002-3611-8516>), Bernie Gray [aut] (<https://orcid.org/0000-0001-9190-6032>)

Initial release