Role Selection
has_role(), all_predictors(), and all_outcomes() can be used to
select variables in a formula that have certain roles.
Similarly, has_type(), all_numeric(), and all_nominal() are used to
select columns based on their data type. Nominal variables include both
character and factor.
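As a minimal sketch of type-based selection, the example below removes every nominal column from a small data frame. The data frame `toy` and its column names are hypothetical and chosen only for illustration; `step_rm()` is used simply because it makes the selected columns easy to see.

```r
library(recipes)

# Hypothetical toy data: one character and one factor column,
# plus a numeric predictor and a numeric outcome
toy <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  num = c(10, 20, 30, 40),
  chr = c("a", "b", "a", "b"),
  fct = factor(c("u", "v", "u", "v")),
  stringsAsFactors = FALSE
)

# all_nominal() selects on type only, so it picks up both chr and fct
baked <- recipe(y ~ ., data = toy) %>%
  step_rm(all_nominal()) %>%
  prep() %>%
  bake(new_data = NULL)

names(baked)
```

Because all_nominal() ignores roles, it would also select a nominal outcome if one were present.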
In most cases, the selectors all_numeric_predictors() and
all_nominal_predictors(), which select on role and type, will be the right
approach for users.
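The following sketch illustrates the combined role-and-type selectors on a small, hypothetical data frame (`toy` and its columns are assumptions made for this example). Numeric predictors are normalized and nominal predictors are converted to dummy variables, while the numeric outcome is left untouched.

```r
library(recipes)

# Hypothetical toy data: numeric outcome, two numeric predictors, one factor
toy <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  x1 = c(10, 20, 30, 40),
  x2 = c(0.5, 0.1, 0.9, 0.3),
  grp = factor(c("a", "b", "a", "b"))
)

rec <- recipe(y ~ ., data = toy) %>%
  # Applies to x1 and x2 only; y is numeric but has the outcome role
  step_normalize(all_numeric_predictors()) %>%
  # Applies to grp only
  step_dummy(all_nominal_predictors())

baked <- rec %>%
  prep() %>%
  bake(new_data = NULL)

names(baked)
```

Using all_numeric() in place of all_numeric_predictors() here would also normalize the outcome, which is usually not intended.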
See selections for more details.
current_info() is an internal function.
All of these functions have limited utility outside of column selection in step functions.
has_role(match = "predictor")
all_predictors()
all_numeric_predictors()
all_nominal_predictors()
all_outcomes()
has_type(match = "numeric")
all_numeric()
all_nominal()
current_info()
match: A single character string for the query. Exact matching is used (i.e., regular expressions won't work).
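To illustrate exact matching, the sketch below assigns a custom role and then selects on it with has_role(). The data frame `toy` and the role name "id variable" are assumptions for this example. The full role string must be supplied; a partial string such as "id" or a pattern such as "id.*" would not match.

```r
library(recipes)

# Hypothetical toy data with an identifier column
toy <- data.frame(
  id = 1:4,
  y = c(1.2, 3.4, 2.2, 5.1),
  x1 = c(10, 20, 30, 40)
)

rec <- recipe(y ~ ., data = toy) %>%
  update_role(id, new_role = "id variable") %>%
  # The query must equal the role string exactly
  step_rm(has_role("id variable"))

baked <- rec %>%
  prep() %>%
  bake(new_data = NULL)

names(baked)
```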
Selector functions return an integer vector.
current_info() returns an environment with objects vars and data.
library(recipes)
library(modeldata)
data(biomass)
rec <- recipe(biomass) %>%
  update_role(
    carbon, hydrogen, oxygen, nitrogen, sulfur,
    new_role = "predictor"
  ) %>%
  update_role(HHV, new_role = "outcome") %>%
  update_role(sample, new_role = "id variable") %>%
  update_role(dataset, new_role = "splitting indicator")
recipe_info <- summary(rec)
recipe_info
# Centering on all predictors except carbon
rec %>%
  step_center(all_predictors(), -carbon) %>%
  prep(training = biomass) %>%
  bake(new_data = NULL)