Permutation-based variable importance
Compute permutation-based variable importance scores for the predictors in a model.
vi_permute(object, ...)

## Default S3 method:
vi_permute(
  object,
  feature_names = NULL,
  train = NULL,
  target = NULL,
  metric = NULL,
  smaller_is_better = NULL,
  type = c("difference", "ratio"),
  nsim = 1,
  keep = TRUE,
  sample_size = NULL,
  sample_frac = NULL,
  reference_class = NULL,
  pred_fun = NULL,
  pred_wrapper = NULL,
  verbose = FALSE,
  progress = "none",
  parallel = FALSE,
  paropts = NULL,
  ...
)
object |
A fitted model object (e.g., a |
... |
Additional optional arguments. (Currently ignored.) |
feature_names |
Character string giving the names of the predictor
variables (i.e., features) of interest. If |
train |
A matrix-like R object (e.g., a data frame or matrix)
containing the training data. If |
target |
Either a character string giving the name (or position) of the
target column in |
metric |
Either a function or character string specifying the
performance metric to use in computing model performance (e.g., RMSE for
regression or accuracy for binary classification). If |
smaller_is_better |
Logical indicating whether or not a smaller value
of |
type |
Character string specifying how to compare the baseline and
permuted performance metrics. Current options are |
nsim |
Integer specifying the number of Monte Carlo replications to
perform. Default is 1. If |
keep |
Logical indicating whether or not to keep the individual
permutation scores for all |
sample_size |
Integer specifying the size of the random sample to use
for each Monte Carlo repetition. Default is |
sample_frac |
Proportion specifying the size of the random sample to use
for each Monte Carlo repetition. Default is |
reference_class |
Character string specifying which response category represents the "reference" class (i.e., the class to which the predicted class probabilities correspond). Only needed for binary classification problems. |
pred_fun |
Deprecated. Use |
pred_wrapper |
Prediction function that requires two arguments,
|
verbose |
Logical indicating whether or not to print information during
the construction of variable importance scores. Default is |
progress |
Character string giving the name of the progress bar to use.
See |
parallel |
Logical indicating whether or not to run |
paropts |
List containing additional options to be passed on to
|
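To make the interaction between type and smaller_is_better concrete, the following is a minimal sketch of one plausible way to compare a baseline score with a permuted score. The helper name compare_scores is hypothetical, and the sign convention (oriented so that a larger result means a more important feature) is an assumption for illustration, not a statement of how vi_permute() computes its scores internally.

```r
# Hypothetical helper, for illustration only; not part of vip.
# Assumption: results are oriented so that larger values mean "more important".
compare_scores <- function(baseline, permuted, smaller_is_better,
                           type = c("difference", "ratio")) {
  type <- match.arg(type)
  if (type == "difference") {
    # Error-type metrics (e.g., RMSE) grow after permutation;
    # score-type metrics (e.g., accuracy) shrink.
    if (smaller_is_better) permuted - baseline else baseline - permuted
  } else {
    if (smaller_is_better) permuted / baseline else baseline / permuted
  }
}

# An error metric that rises from 1.0 to 2.5 after permuting a feature
d <- compare_scores(baseline = 1.0, permuted = 2.5, smaller_is_better = TRUE)

# A score metric that falls from 0.9 to 0.6 after permuting a feature
r <- compare_scores(baseline = 0.9, permuted = 0.6, smaller_is_better = FALSE,
                    type = "ratio")
```

Under this convention a difference near 0 (or a ratio near 1) suggests that permuting the feature barely changed model performance.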
Coming soon!
A tidy data frame (i.e., a "tibble" object) with two columns: Variable and Importance.
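To illustrate where a two-column result of this shape can come from, here is a minimal base-R sketch of the general permutation importance idea: score the model once, shuffle one feature at a time, re-score, and record the change. It is a simplified illustration and not vip's implementation; the names perm_importance, rmse, and dat are hypothetical, and it returns a plain data frame rather than a tibble.

```r
# Simplified illustration of permutation importance; NOT vip's internal code.
# perm_importance(), rmse(), and dat are hypothetical names.
set.seed(101)  # for reproducibility
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 3 * dat$x1 + 0.5 * dat$x2 + rnorm(n, sd = 0.1)  # x3 is pure noise

fit <- lm(y ~ ., data = dat)

# Root mean squared error (smaller is better)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

perm_importance <- function(object, train, target, metric, nsim = 1) {
  features <- setdiff(names(train), target)
  baseline <- metric(train[[target]], predict(object, newdata = train))
  scores <- sapply(features, function(f) {
    # Average the increase in error over nsim random shuffles of feature f
    mean(replicate(nsim, {
      shuffled <- train
      shuffled[[f]] <- sample(shuffled[[f]])  # break the link between f and y
      metric(shuffled[[target]], predict(object, newdata = shuffled)) - baseline
    }))
  })
  data.frame(Variable = features, Importance = unname(scores))
}

imp <- perm_importance(fit, train = dat, target = "y", metric = rmse, nsim = 5)
imp[order(-imp$Importance), ]  # most important feature first
```

Because x1 has by far the largest effect on y in the simulated data, shuffling it degrades the fit the most, while the noise feature x3 should score close to zero.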
## Not run: 
# Load required packages
library(ggplot2)  # for ggtitle() function
library(nnet)     # for fitting neural networks

# Simulate training data
trn <- gen_friedman(500, seed = 101)  # ?vip::gen_friedman

# Inspect data
tibble::as_tibble(trn)

# Fit PPR and NN models (hyperparameters were chosen using the caret package
# with 5 repeats of 5-fold cross-validation)
pp <- ppr(y ~ ., data = trn, nterms = 11)
set.seed(0803)  # for reproducibility
nn <- nnet(y ~ ., data = trn, size = 7, decay = 0.1, linout = TRUE,
           maxit = 500)

# Plot VI scores
set.seed(2021)  # for reproducibility
p1 <- vip(pp, method = "permute", target = "y", metric = "rsquared",
          pred_wrapper = predict) + ggtitle("PPR")
p2 <- vip(nn, method = "permute", target = "y", metric = "rsquared",
          pred_wrapper = predict) + ggtitle("NN")
# grid.arrange() comes from the gridExtra package, which is not attached above
gridExtra::grid.arrange(p1, p2, ncol = 2)

# Mean absolute error
mae <- function(actual, predicted) {
  mean(abs(actual - predicted))
}

# Permutation-based VIP with user-defined MAE metric
set.seed(1101)  # for reproducibility
vip(pp, method = "permute", target = "y", metric = mae,
    smaller_is_better = TRUE,
    pred_wrapper = function(object, newdata) predict(object, newdata)
) + ggtitle("PPR")

## End(Not run)