Missing Value Imputation with kNN
Imputes missing values in a matrix composed of categorical variables using k Nearest Neighbors.
knncatimpute(x, dist = NULL, nn = 3, weights = TRUE)
x |
a numeric matrix containing missing values. All non-missing values
must be integers between 1 and n.cat, where n.cat
is the maximum number of levels the categorical variables in |
dist |
either a character string naming the distance measure or a distance matrix.
If the former, |
nn |
an integer specifying k, i.e. the number of nearest neighbors, used in the imputation of the missing values. |
weights |
should weighted kNN be used to impute the missing values? If |
A matrix of the same size as x
in which all the missing values have been imputed.
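To make the idea concrete, here is a minimal sketch of row-wise kNN imputation for categorical data. It is not the code behind knncatimpute. The helper name knn_cat_impute_sketch, the simple-matching distance (share of mismatching entries), and the reciprocal-distance weighting are all assumptions chosen for illustration.

```r
# Hypothetical helper for illustration only; not part of any package.
knn_cat_impute_sketch <- function(x, nn = 3, weights = TRUE) {
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    # Distance from row i to every other row: share of mismatches
    # among the columns that are observed in both rows (assumed measure).
    d <- vapply(seq_len(nrow(x)), function(j) {
      both <- !is.na(x[i, ]) & !is.na(x[j, ])
      if (j == i || !any(both)) return(Inf)
      mean(x[i, both] != x[j, both])
    }, numeric(1))

    for (col in which(is.na(x[i, ]))) {
      # Candidate neighbors must have an observed value in this column.
      cand <- which(is.finite(d) & !is.na(x[, col]))
      if (length(cand) == 0) next
      near <- cand[order(d[cand])][seq_len(min(nn, length(cand)))]

      # Assumed weighting: reciprocal distance, offset to avoid division by zero.
      w <- if (weights) 1 / (d[near] + 1e-8) else rep(1, length(near))

      # Weighted vote over the neighbors' categories.
      votes <- tapply(w, x[near, col], sum)
      out[i, col] <- as.numeric(names(votes)[which.max(votes)])
    }
  }
  out
}

# Usage on data shaped like the example for knncatimpute.
set.seed(1)
mat <- matrix(sample(3, 10000, TRUE), 200)
mat[sample(10000, 20)] <- NA
imp <- knn_cat_impute_sketch(mat, nn = 5)
```

The sketch only ever fills the entries that were missing, so all observed values in the input are returned unchanged.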
Holger Schwender, holger.schwender@udo.edu
Schwender, H. (2007). Statistical Analysis of Genotype and Gene Expression Data. Dissertation, Department of Statistics, University of Dortmund.
## Not run:
# Generate a data set consisting of 200 rows and 50 columns
# in which the values are integers between 1 and 3.
# Afterwards, remove 20 of the values randomly.
mat <- matrix(sample(3, 10000, TRUE), 200)
mat[sample(10000, 20)] <- NA

# Replace the missing values.
mat2 <- knncatimpute(mat)

# Replace the missing values using the 5 nearest neighbors
# and Cohen's Kappa.
mat3 <- knncatimpute(mat, nn = 5, dist = "cohen")
## End(Not run)