
stratrs

Perform stratified random sampling to balance outcomes


Description

Performs stratified random sampling so that the outcome is balanced across the shards.

Usage

stratrs(y, C=5, P=0)

Arguments

y

The binary/categorical/continuous outcome.

C

The number of shards to break the data set into.

P

For a continuous outcome, the range of y is broken into P segments at its quantiles. Specifying P=20 seems to work reasonably well.
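To make the role of P concrete, here is a minimal base R sketch of dividing a continuous outcome into P quantile segments. It is an illustration only, not the code used inside stratrs; the variable names `breaks` and `segment` are assumptions made for this example.

```r
## Illustration only: bin a continuous outcome into P quantile segments.
## This is not the BART implementation.
set.seed(1)
y <- rnorm(1000)
P <- 20

## Cut points at the 0, 1/P, 2/P, ..., 1 quantiles of y
breaks <- quantile(y, probs = seq(0, 1, length.out = P + 1))

## include.lowest = TRUE keeps the minimum of y in the first segment;
## labels = FALSE returns integer segment codes 1..P
segment <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)

## Each segment holds roughly the same number of observations
table(segment)
```

Once a continuous outcome has been binned this way, the segment codes can be treated like the levels of a categorical outcome when forming strata.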

Details

To fit BART to large data sets, the data are randomly divided into C shards. Each shard should be balanced with respect to the outcome. For binary/categorical outcomes, this function performs stratified random sampling to achieve that balance. For continuous outcomes, see the argument P.
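The idea of balancing shards by stratum can be sketched in base R as follows. The helper `assign_shards` is hypothetical and written only for this illustration; it is not the source of stratrs, and the package may handle details such as ties or remainders differently.

```r
## Hypothetical helper, not the BART implementation:
## within each outcome level, spread observations evenly over C shards.
assign_shards <- function(y, C = 5) {
  shard <- integer(length(y))
  for (level in unique(y)) {
    idx <- which(y == level)
    ## Recycle the labels 1..C over the stratum, then shuffle them
    labels <- rep_len(seq_len(C), length(idx))
    shard[idx] <- labels[sample.int(length(idx))]
  }
  shard
}

set.seed(12)
x <- rbinom(25000, 1, 0.1)
s <- assign_shards(x)

## Rows are shards, columns are outcome levels
tab <- table(s, x)
tab
```

Because the labels are recycled within each stratum, the shard counts for any single outcome level differ by at most one in this sketch.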

Value

A vector of the same length as y is returned, with each element giving the shard to which the corresponding observation is assigned.

Examples

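library(BART)

set.seed(12)

## binary outcome with an event probability of 0.1
x <- rbinom(25000, 1, 0.1)
a <- stratrs(x)
table(a, x)   ## shards in rows, outcome levels in columns

## categorical outcome: Poisson counts capped at 5
z <- pmin(rpois(25000, 0.8), 5)
b <- stratrs(z)
table(b, z)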
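The examples above cover binary and categorical outcomes. A continuous outcome could be handled along the following lines, using P=20 as suggested under Arguments. This sketch assumes the BART package is installed and does nothing otherwise; the simulated outcome `y` is an assumption made for illustration.

```r
## Assumes the BART package is installed; skipped otherwise
if (requireNamespace("BART", quietly = TRUE)) {
  set.seed(12)
  y <- rnorm(25000)                      ## simulated continuous outcome
  s <- BART::stratrs(y, C = 5, P = 20)   ## 5 shards, 20 quantile segments

  print(table(s))            ## number of observations per shard
  print(tapply(y, s, mean))  ## outcome mean within each shard
}
```

If the shards are well balanced, the shard sizes and the per-shard means of the outcome should be close to one another.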

BART

Bayesian Additive Regression Trees

v2.9
GPL (>= 2)
Authors
Robert McCulloch [aut], Rodney Sparapani [aut, cre], Charles Spanbauer [aut], Robert Gramacy [aut], Matthew Pratola [aut], Martyn Plummer [ctb], Nicky Best [ctb], Kate Cowles [ctb], Karen Vines [ctb]
Initial release
2020-12-21
