Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

oversample

Over- or undersample binary classification task to handle class imbalancy.


Description

Oversampling: For a given class (usually the smaller one) all existing observations are taken and copied and extra observations are added by randomly sampling with replacement from this class.

Undersampling: For a given class (usually the larger one) the number of observations is reduced (downsampled) by randomly sampling without replacement from this class.

Usage

oversample(task, rate, cl = NULL)

undersample(task, rate, cl = NULL)

Arguments

task

(Task)
The task.

rate

(numeric(1))
Factor to upsample or downsample a class. For undersampling: Must be between 0 and 1, where 1 means no downsampling, 0.5 implies reduction to 50 percent and 0 would imply reduction to 0 observations. For oversampling: Must be between 1 and Inf, where 1 means no oversampling and 2 would mean doubling the class size.

cl

(character(1))
Which class should be over- or undersampled. If NULL, oversample will select the smaller and undersample the larger class.

Value

See Also


mlr

Machine Learning in R

v2.19.0
BSD_2_clause + file LICENSE
Authors
Bernd Bischl [aut] (<https://orcid.org/0000-0001-6002-6980>), Michel Lang [aut] (<https://orcid.org/0000-0001-9754-0393>), Lars Kotthoff [aut], Patrick Schratz [aut, cre] (<https://orcid.org/0000-0003-0748-6624>), Julia Schiffner [aut], Jakob Richter [aut], Zachary Jones [aut], Giuseppe Casalicchio [aut] (<https://orcid.org/0000-0001-5324-5966>), Mason Gallo [aut], Jakob Bossek [ctb] (<https://orcid.org/0000-0002-4121-4668>), Erich Studerus [ctb] (<https://orcid.org/0000-0003-4233-0182>), Leonard Judt [ctb], Tobias Kuehn [ctb], Pascal Kerschke [ctb] (<https://orcid.org/0000-0003-2862-1418>), Florian Fendt [ctb], Philipp Probst [ctb] (<https://orcid.org/0000-0001-8402-6790>), Xudong Sun [ctb] (<https://orcid.org/0000-0003-3269-2307>), Janek Thomas [ctb] (<https://orcid.org/0000-0003-4511-6245>), Bruno Vieira [ctb], Laura Beggel [ctb] (<https://orcid.org/0000-0002-8872-8535>), Quay Au [ctb] (<https://orcid.org/0000-0002-5252-8902>), Martin Binder [ctb], Florian Pfisterer [ctb], Stefan Coors [ctb], Steve Bronder [ctb], Alexander Engelhardt [ctb], Christoph Molnar [ctb], Annette Spooner [ctb]
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.