Permutation sampling
A permutation sample is the same size as the original data set and is made
by permuting/shuffling one or more columns. This results in analysis
samples where some columns are in their original order and some columns
are permuted to a random order. Unlike other sampling functions in
rsample, there is no assessment set and calling assessment() on a
permutation split will throw an error.
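The behaviour described above can be checked with a minimal sketch. It assumes rsample is installed; the seed and the object names (perm, split, shuffled, msg) are illustrative choices rather than part of the package.

```r
library(rsample)

set.seed(1)  # arbitrary seed, used only so the shuffle is reproducible

# One permutation sample of mtcars with only the mpg column shuffled
perm <- permutations(mtcars, permute = mpg, times = 1)
split <- perm$splits[[1]]
shuffled <- analysis(split)

# The analysis set has as many rows as the original data
nrow(shuffled) == nrow(mtcars)

# A column that was not permuted keeps its original order
identical(shuffled$cyl, mtcars$cyl)

# The permuted column holds the same values, possibly in a new order
identical(sort(shuffled$mpg), sort(mtcars$mpg))

# There is no assessment set; tryCatch() captures the error message
# instead of stopping the script
msg <- tryCatch(assessment(split), error = function(e) conditionMessage(e))
msg
```

Wrapping assessment() in tryCatch() is only a convenience for demonstration; in normal use, permutation splits are passed to analysis() alone.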
permutations(data, permute = NULL, times = 25, apparent = FALSE, ...)
data: A data frame.
permute: One or more columns to shuffle. This argument supports selection helpers such as starts_with(), as used in the examples.
times: The number of permutation samples.
apparent: A logical. Should an extra resample be added whose analysis set is the original, unpermuted data set?
...: Not currently used.
Setting apparent = TRUE adds one extra "resample" whose analysis set is the
original, unpermuted data set. Permutation-based resampling is especially
helpful for computing a statistic under the null hypothesis (e.g. a
t-statistic). This forms the basis of a permutation test, which compares the
observed statistic with its distribution over all possible permutations of
the data. With times random permutations, that distribution is approximated
rather than enumerated.
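As a sketch of how such a test might be completed, the code below compares the observed t-statistic with its permutation distribution and derives a two-sided Monte Carlo p-value. The helper t_stat(), the choice of 500 permutations, the seed, and the add-one correction in the p-value are assumptions made for illustration; none of them is prescribed by rsample.

```r
library(rsample)

set.seed(123)  # arbitrary seed for reproducibility

# Hypothetical helper: Welch t-statistic of hp between the two vs groups
t_stat <- function(df) unname(t.test(hp ~ vs, data = df)$statistic)

# Statistic on the original, unpermuted data
observed <- t_stat(mtcars)

# Shuffling hp breaks any association between hp and vs, so each
# permuted sample yields one draw from the null distribution
resamples <- permutations(mtcars, permute = hp, times = 500)
null_stats <- vapply(
  resamples$splits,
  function(s) t_stat(analysis(s)),
  numeric(1)
)

# Two-sided Monte Carlo p-value; adding 1 to numerator and denominator
# is a common convention that keeps the estimate away from exactly zero
p_value <- (sum(abs(null_stats) >= abs(observed)) + 1) /
  (length(null_stats) + 1)
p_value
```

A larger value of times gives a finer-grained p-value at the cost of more computation.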
A tibble with classes permutations, rset, tbl_df, tbl, and
data.frame. The results include a column for the data split objects and a
column called id that has a character string with the resample
identifier.
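The structure of the returned object can be inspected directly. This is a small sketch that assumes rsample is installed; the object name res is illustrative.

```r
library(rsample)

res <- permutations(mtcars, permute = mpg, times = 3)

# Class vector of the returned tibble
class(res)

# One row per permutation sample
nrow(res)

# Column names: the split objects and the resample identifier
names(res)

# The identifiers are stored as character strings
res$id
```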
library(rsample)
library(purrr)

# Two permutation samples with the mpg column shuffled
permutations(mtcars, mpg, times = 2)

# Also include an "apparent" resample that holds the original data
permutations(mtcars, mpg, times = 2, apparent = TRUE)

# Shuffle the columns whose names start with "c" (cyl and carb)
resample1 <- permutations(mtcars, starts_with("c"), times = 1)
resample1$splits[[1]] %>% analysis()

# t-statistic of hp by vs for each resample, including the apparent one
resample2 <- permutations(mtcars, hp, times = 10, apparent = TRUE)
map_dbl(resample2$splits, function(x) {
  t.test(hp ~ vs, data = analysis(x))$statistic
})