Undersample a dataset by iteratively removing the observation with the lowest total distance to its neighbors of the same class.
undersample_mindist(data, cls, cls_col, m, dist_calc = "euclidean")
data | Dataset to undersample. Aside from
cls | Class to be undersampled.
cls_col | Column containing class information.
m | Desired number of observations after undersampling.
dist_calc | Method for distance calculation. See
An undersampled dataframe.
setosa <- iris[iris$Species == "setosa", ]
nrow(setosa)
undersamp <- undersample_mindist(setosa, "setosa", "Species", 50)
nrow(undersamp)
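The removal procedure described at the top of this page can be sketched in base R. This is only an illustration under stated assumptions, not the package's source: `mindist_sketch` is a hypothetical name, "neighbors" is taken to mean every other remaining observation of the same class, all non-class columns are assumed numeric, and `dist_calc` is assumed to be passed straight to `stats::dist()` as its `method`.

```r
# Hypothetical helper for illustration; not the implementation of undersample_mindist().
mindist_sketch <- function(data, cls, cls_col, m, dist_calc = "euclidean") {
  # Keep only the rows belonging to the class being undersampled
  subset <- data[data[[cls_col]] == cls, , drop = FALSE]

  # Pairwise distances over every column except the class column
  # (assumption: these columns are all numeric)
  feats <- subset[, names(subset) != cls_col, drop = FALSE]
  d <- as.matrix(dist(feats, method = dist_calc))

  keep <- seq_len(nrow(subset))
  while (length(keep) > m) {
    # Total distance from each remaining row to the other remaining rows
    totals <- rowSums(d[keep, keep, drop = FALSE])
    # Remove the row with the smallest total; ties go to the first such row
    keep <- keep[-which.min(totals)]
  }

  subset[keep, , drop = FALSE]
}

# Usage: reduce the 50 setosa rows of iris to 30
setosa <- iris[iris$Species == "setosa", ]
reduced <- mindist_sketch(setosa, "setosa", "Species", 30)
nrow(reduced)
```

Recomputing the totals on each pass means that removing one point in a dense cluster lowers the pressure on its neighbors. Whether the package updates distances the same way is not stated here.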