imputations

Built-in imputation methods.


Description

The built-ins are:

  • imputeConstant(const) for imputation using a constant value,

  • imputeMedian() for imputation using the median,

  • imputeMean() for imputation using the mean,

  • imputeMode() for imputation using the mode,

  • imputeMin(multiplier) for imputing constant values shifted below the minimum using min(x) - multiplier * diff(range(x)),

  • imputeMax(multiplier) for imputing constant values shifted above the maximum using max(x) + multiplier * diff(range(x)),

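  • imputeUniform(min, max) for imputation using uniformly distributed random values. Bounds will be estimated from the data if not provided.

  • imputeNormal(mu, sd) for imputation using normally distributed random values. Mean and standard deviation will be calculated from the data if not provided.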

  • imputeHist(breaks, use.mids) for imputation using random values with probabilities calculated using table or hist.

  • imputeLearner(learner, features = NULL) for imputations using the response of a classification or regression learner.
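The built-ins are not applied directly; they are passed to mlr's impute() function, per column class or per column. A minimal sketch, assuming a toy data frame invented here for illustration:

```r
library(mlr)

# Toy data (hypothetical): one numeric and one factor column, each with an NA.
df <- data.frame(
  x = c(1, 2, NA, 4, 100),
  f = factor(c("a", "b", "a", NA, "a"))
)

# Impute by column class: median for numeric columns, mode for factors.
imp <- impute(df, classes = list(numeric = imputeMedian(), factor = imputeMode()))

imp$data  # completed data frame

# imp$desc stores what was learned and can be re-applied to new data.
new <- data.frame(x = c(NA, 7), f = factor(c("b", NA), levels = c("a", "b")))
out <- reimpute(new, imp$desc)
```

Keeping the description object separate from the imputed data means values learned on training data can be reused on test data without being re-estimated.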

Usage

imputeConstant(const)

imputeMedian()

imputeMean()

imputeMode()

imputeMin(multiplier = 1)

imputeMax(multiplier = 1)

imputeUniform(min = NA_real_, max = NA_real_)

imputeNormal(mu = NA_real_, sd = NA_real_)

imputeHist(breaks, use.mids = TRUE)

imputeLearner(learner, features = NULL)

Arguments

const

(any)
Constant value used for imputation.

multiplier

(numeric(1))
Factor by which the range of the data, diff(range(x)), is multiplied before it is subtracted from the minimum (imputeMin) or added to the maximum (imputeMax).
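The arithmetic behind imputeMin and imputeMax can be traced in base R. This is only a worked sketch of the formulas given in the Description, not the package code; the vector x is made up, and dropping missing values before computing the range is an assumption.

```r
x <- c(2, 5, NA, 10)   # hypothetical column with one missing value
multiplier <- 1

rng <- range(x, na.rm = TRUE)   # observed minimum and maximum
spread <- diff(rng)             # max - min

below_min <- rng[1] - multiplier * spread   # 2 - 1 * 8
above_max <- rng[2] + multiplier * spread   # 10 + 1 * 8

# Fill the gap with the shifted constant, as imputeMin would.
x_min_imputed <- replace(x, is.na(x), below_min)
```

A larger multiplier pushes the imputed constant further outside the observed range, which makes imputed rows easy to separate from observed ones.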

min

(numeric(1))
Lower bound for uniform distribution. If NA (default), it will be estimated from the data.

max

(numeric(1))
Upper bound for uniform distribution. If NA (default), it will be estimated from the data.

mu

(numeric(1))
Mean of normal distribution. If NA (default), it will be estimated from the data.

sd

(numeric(1))
Standard deviation of normal distribution. If NA (default), it will be estimated from the data.

breaks

(numeric(1))
Number of breaks to use in graphics::hist. If missing, defaults to auto-detection via “Sturges”.

use.mids

(logical(1))
If x is numeric and a histogram is used, impute with bin mids (default) or instead draw uniformly distributed samples within bin range.
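The difference between the two use.mids settings can be illustrated in base R. The following is a rough sketch of the idea only, under the assumption that bins are drawn with probability proportional to their counts; it is not the mlr implementation, and the data are invented.

```r
set.seed(1)
x <- c(1.2, 1.9, 2.4, 2.6, 3.1, 3.3, 3.8, 7.5, NA, NA)  # hypothetical data

# hist() ignores non-finite values, so the NAs do not affect the bins.
h <- hist(x, breaks = "Sturges", plot = FALSE)
probs <- h$counts / sum(h$counts)
n_miss <- sum(is.na(x))

# Draw one bin per missing value, weighted by bin frequency.
bin <- sample(seq_along(h$counts), size = n_miss, replace = TRUE, prob = probs)

# use.mids = TRUE: impute with the midpoint of the drawn bin.
draw_mids <- h$mids[bin]

# use.mids = FALSE: impute with a uniform draw inside the drawn bin.
draw_unif <- runif(n_miss, min = h$breaks[bin], max = h$breaks[bin + 1])
```

Midpoints yield only as many distinct imputed values as there are bins, while uniform draws within the bin spread the imputed values more smoothly.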

learner

(Learner | character(1))
Supervised learner. Its predictions will be used for imputations. If you pass a string the learner will be created via makeLearner. Note that the target column is not available for this operation.

features

(character)
Features to use in learner for prediction. Default is NULL which uses all available features except the target column of the original task.
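A minimal sketch of learner-based imputation, assuming a toy data frame invented here in which x2 is an exact linear function of x1. The learner "regr.lm" is mlr's wrapper around stats::lm; the column names are hypothetical.

```r
library(mlr)

df <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(3, 5, 7, NA, 11, 13),          # x2 = 2 * x1 + 1, one value missing
  y  = c(0.5, 1.1, 1.4, 2.2, 2.4, 3.1)  # target, not used for imputation
)

# Predict the missing x2 from x1 only; the target y is declared so that
# it is excluded from the imputation model.
imp <- impute(
  df,
  target = "y",
  cols = list(x2 = imputeLearner("regr.lm", features = "x1"))
)

imp$data$x2[4]  # close to 9, since x2 = 2 * x1 + 1 on the observed rows
```

Restricting features to columns without missing values avoids the need for a learner that can handle missing inputs itself.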

mlr

Machine Learning in R

v2.19.0
BSD_2_clause + file LICENSE
Authors
Bernd Bischl [aut] (<https://orcid.org/0000-0001-6002-6980>), Michel Lang [aut] (<https://orcid.org/0000-0001-9754-0393>), Lars Kotthoff [aut], Patrick Schratz [aut, cre] (<https://orcid.org/0000-0003-0748-6624>), Julia Schiffner [aut], Jakob Richter [aut], Zachary Jones [aut], Giuseppe Casalicchio [aut] (<https://orcid.org/0000-0001-5324-5966>), Mason Gallo [aut], Jakob Bossek [ctb] (<https://orcid.org/0000-0002-4121-4668>), Erich Studerus [ctb] (<https://orcid.org/0000-0003-4233-0182>), Leonard Judt [ctb], Tobias Kuehn [ctb], Pascal Kerschke [ctb] (<https://orcid.org/0000-0003-2862-1418>), Florian Fendt [ctb], Philipp Probst [ctb] (<https://orcid.org/0000-0001-8402-6790>), Xudong Sun [ctb] (<https://orcid.org/0000-0003-3269-2307>), Janek Thomas [ctb] (<https://orcid.org/0000-0003-4511-6245>), Bruno Vieira [ctb], Laura Beggel [ctb] (<https://orcid.org/0000-0002-8872-8535>), Quay Au [ctb] (<https://orcid.org/0000-0002-5252-8902>), Martin Binder [ctb], Florian Pfisterer [ctb], Stefan Coors [ctb], Steve Bronder [ctb], Alexander Engelhardt [ctb], Christoph Molnar [ctb], Annette Spooner [ctb]
Initial release