Undersample a dataset by hierarchical clustering.
undersample_hclust(data, cls, cls_col, m, k = 5, h = NA, dist_calc = "euclidean")
data | Dataset to be undersampled.
cls | Majority class that will be undersampled.
cls_col | Column in data containing class memberships.
m | Number of samples in undersampled dataset.
k | Number of clusters to derive from clustering.
h | Height at which to cut the clustering tree.
dist_calc | Distance calculation method. See
Undersampled data frame containing only cls.
table(iris$Species)
undersamp <- undersample_hclust(iris, "setosa", "Species", 15)
nrow(undersamp)
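The arguments above suggest a general approach: cluster the rows of one class, then draw samples spread across the clusters. The sketch below illustrates that idea using only base R (`dist`, `hclust`, `cutree`). It is not the package's implementation. The function name `sketch_undersample`, the restriction to numeric columns, the rule that `h` takes precedence over `k` when supplied, and the round-robin sampling across clusters are all assumptions made for illustration.

```r
# Hypothetical sketch, not the package's implementation: undersample one class
# by clustering its rows and drawing samples evenly across the clusters.
sketch_undersample <- function(data, cls, cls_col, m, k = 5, h = NA,
                               dist_calc = "euclidean") {
  # Keep only the rows of the class to be undersampled.
  sub <- data[data[[cls_col]] == cls, , drop = FALSE]

  # Assumption: distances are computed on numeric columns only.
  num <- sub[, vapply(sub, is.numeric, logical(1)), drop = FALSE]
  tree <- hclust(dist(num, method = dist_calc))

  # Assumption: cut by height when h is supplied, otherwise into k clusters.
  groups <- if (is.na(h)) cutree(tree, k = k) else cutree(tree, h = h)

  # Shuffle the rows, then rank each shuffled row within its cluster.
  idx <- sample(nrow(sub))
  rank_in_cluster <- ave(seq_along(idx), groups[idx], FUN = seq_along)

  # Taking rows in rank order draws round-robin across the clusters.
  keep <- idx[order(rank_in_cluster)][seq_len(min(m, nrow(sub)))]
  sub[keep, , drop = FALSE]
}

set.seed(1)
out <- sketch_undersample(iris, "setosa", "Species", 15)
nrow(out)
```

Drawing round-robin keeps small clusters represented in the result, which is one plausible reason to cluster before sampling rather than sampling rows uniformly.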