Split numeric variables into smaller groups
Recode numeric variables into equal-sized groups, i.e. a variable is cut into a smaller number of groups at specific cut points. split_var_if() is a scoped variant of split_var(), where the transformation is applied only to those variables that match the logical condition of predicate.
split_var(
  x,
  ...,
  n,
  as.num = FALSE,
  val.labels = NULL,
  var.label = NULL,
  inclusive = FALSE,
  append = TRUE,
  suffix = "_g"
)

split_var_if(
  x,
  predicate,
  n,
  as.num = FALSE,
  val.labels = NULL,
  var.label = NULL,
  inclusive = FALSE,
  append = TRUE,
  suffix = "_g"
)
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
n |
The new number of groups that |
as.num |
Logical, if |
val.labels |
Optional character vector, to set value label attributes
of recoded variable (see vignette Labelled Data and the sjlabelled-Package).
If |
var.label |
Optional string, to set variable label attribute for the
returned variable (see vignette Labelled Data and the sjlabelled-Package).
If |
inclusive |
Logical; if |
append |
Logical, if |
suffix |
Indicates which suffix will be added to each dummy variable.
Use |
predicate |
A predicate function to be applied to the columns. The
variables for which |
split_var() splits a variable into equal-sized groups, where the number of groups depends on the n-argument. Thus, this function cuts a variable into groups at the specified quantiles.
By contrast, group_var() recodes a variable into groups, where groups have the same value range (e.g., from 1-5, 6-10, 11-15 etc.).
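The difference between the two approaches can be sketched in base R. This is only an illustration of the idea, using cut() and quantile() on made-up data; it is not the code that split_var() or group_var() run internally, and the variable names are hypothetical.

```r
# Illustrative data (made up): 12 values, skewed towards small numbers
x <- c(1, 2, 2, 3, 4, 5, 7, 9, 12, 15, 18, 20)

# Equal-sized groups: cut points are the tertiles of x
q_breaks <- quantile(x, probs = seq(0, 1, length.out = 4))
by_quantile <- cut(x, breaks = q_breaks, include.lowest = TRUE, labels = FALSE)

# Equal-range groups: cut points are fixed intervals of width 5
by_range <- cut(x, breaks = seq(0, 20, by = 5), labels = FALSE)

# Quantile-based cutting balances the group sizes,
# range-based cutting balances the width of each interval
table(by_quantile)
table(by_range)
```

With these data, the quantile-based split puts four values into each group, while the range-based split puts six values into the first interval and two into each of the others.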
split_var() also works on grouped data frames (see group_by). In this case, splitting is applied to the subsets of variables in x. See 'Examples'.
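What splitting "within subsets" means can be sketched in base R with ave(). The helper split_equal() below is hypothetical and written only for this illustration; it is not part of sjmisc and does not reproduce the package's handling of labels, ties or the inclusive-argument.

```r
# Hypothetical helper: quantile-based split of a numeric vector into n groups
split_equal <- function(x, n) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# Cut points computed once, over all cars
overall <- split_equal(mtcars$disp, 3)

# Cut points computed separately within each level of cyl
within_cyl <- ave(mtcars$disp, mtcars$cyl, FUN = function(v) split_equal(v, 3))

# Compare how the groups are distributed across cyl
table(mtcars$cyl, overall)
table(mtcars$cyl, within_cyl)
```

In the grouped version every cyl level gets its own cut points, so each level contains all three groups; in the ungrouped version a level may fall almost entirely into a single group.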
A grouped variable with equal-sized groups. If x is a data frame, for append = TRUE, x including the grouped variables as new columns is returned; if append = FALSE, only the grouped variables will be returned. If append = TRUE and suffix = "", recoded variables will replace (overwrite) existing variables.
If a vector has only a few unique values, splitting into equal-sized groups may fail. In this case, use the inclusive-argument to shift a value at the cut point into the lower, preceding group to get equal-sized groups. See 'Examples'.
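The effect of moving the cut-point value into the lower group can be sketched with base R's cut(), where the right-argument decides which side of an interval is closed. This assumes that inclusive behaves like closing the intervals on the right; that mapping is an assumption made for illustration, and the data are made up.

```r
# Made-up variable with only four unique values
x <- rep(1:4, times = c(3, 3, 3, 2))

# Cut point for two groups: the median, which coincides with an observed value
cut_point <- median(x)
breaks <- c(min(x), cut_point, max(x))

# Cut-point value goes into the upper group: intervals [1, 2) and [2, 4]
non_inclusive <- cut(x, breaks = breaks, include.lowest = TRUE,
                     right = FALSE, labels = FALSE)

# Cut-point value goes into the lower group: intervals [1, 2] and (2, 4]
inclusive <- cut(x, breaks = breaks, include.lowest = TRUE,
                 right = TRUE, labels = FALSE)

table(non_inclusive)
table(inclusive)
```

Here the first variant yields groups of 3 and 8 values, while the second yields groups of 6 and 5, which is as close to equal as these data allow.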
library(sjmisc)
data(efc)

# non-grouped
table(efc$neg_c_7)

# split into 3 groups
table(split_var(efc$neg_c_7, n = 3))

# split multiple variables into 3 groups
split_var(efc, neg_c_7, pos_v_4, e17age, n = 3, append = FALSE)
frq(split_var(efc, neg_c_7, pos_v_4, e17age, n = 3, append = FALSE))

# original
table(efc$e42dep)

# two groups, non-inclusive cut-point
# vector split leads to unequal group sizes
table(split_var(efc$e42dep, n = 2))

# two groups, inclusive cut-point
# group sizes are equal
table(split_var(efc$e42dep, n = 2, inclusive = TRUE))

# Unlike dplyr's ntile(), split_var() never splits a value
# into two different categories, i.e. you always get a clean
# separation of original categories
library(dplyr)
x <- dplyr::ntile(efc$neg_c_7, n = 3)
table(efc$neg_c_7, x)

x <- split_var(efc$neg_c_7, n = 3)
table(efc$neg_c_7, x)

# also works with grouped data frames
mtcars %>%
  split_var(disp, n = 3, append = FALSE) %>%
  table()

mtcars %>%
  group_by(cyl) %>%
  split_var(disp, n = 3, append = FALSE) %>%
  table()