General Interface Seeded KMeans
Unlike traditional KMeans, this method is initialized with one cluster per class present in the labelled data. The initial center of each cluster is the mean of the labelled observations of the corresponding class.
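The seeding step described above can be sketched with base R alone. This is only an illustration of the idea, not the code SSLR runs internally: it assumes Euclidean distance, uses `stats::kmeans` for the iterations, and the labelled subset (`labeled_idx`, 10 observations per class) is a hypothetical choice made for the example.

```r
# Illustrative sketch of seeded initialization (assumption: not SSLR internals).
data(iris)
set.seed(1)

x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Hypothetical labelled subset: 10 randomly chosen observations per class.
labeled_idx <- unlist(lapply(split(seq_along(y), y), function(i) sample(i, 10)))

# One seed per class: the column-wise mean of that class's labelled rows.
seeds <- apply(x[labeled_idx, ], 2, function(col) tapply(col, y[labeled_idx], mean))

# Ordinary k-means started from the class means instead of random centers.
fit <- kmeans(x, centers = seeds, iter.max = 10)

# Compare the resulting clusters with the true species.
print(table(cluster = fit$cluster, species = y))
```

Because the rows of `seeds` follow the factor levels of `Species`, cluster 1 starts from the first class, cluster 2 from the second, and so on, so the number of clusters never has to be chosen by hand.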
seeded_kmeans(max_iter = 10, method = "euclidean")
max_iter: maximum number of iterations in KMeans. Default is 10.
method: distance method in KMeans: "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Default is "euclidean".
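The accepted `method` values coincide with the method names of base R's `stats::dist`. Assuming they carry the same meaning here (this page does not confirm it), the following sketch shows how some of them differ on a pair of toy points; the matrix `pts` is made up for illustration.

```r
# Two toy points; their coordinate differences are (3, -2, 0).
pts <- rbind(a = c(1, 2, 3), b = c(4, 0, 3))

# Distance between a and b under several of the listed methods,
# computed with stats::dist (assumed to share semantics with `method`).
methods <- c("euclidean", "maximum", "manhattan", "canberra")
d <- sapply(methods, function(m) as.numeric(dist(pts, method = m)))
print(d)
```

The Euclidean value is sqrt(13), the maximum distance is 3, the Manhattan distance is 5, and the Canberra distance is 3/5 + 2/2 + 0/6 = 1.6, so the choice of method can change which center an observation is closest to.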
Sugato Basu, Arindam Banerjee, Raymond Mooney.
Semi-supervised clustering by seeding.
In Proceedings of the 19th International Conference on Machine Learning, July 2002.
library(tidyverse)
library(caret)
library(SSLR)
library(tidymodels)

data <- iris

set.seed(1)

#% LABELED
cls <- which(colnames(iris) == "Species")

# Keep labels for 20% of the observations and set the rest to NA (unlabelled)
labeled.index <- createDataPartition(data$Species, p = .2, list = FALSE)
data[-labeled.index, cls] <- NA

m <- seeded_kmeans() %>% fit(Species ~ ., data)

# Get labels (assign clusters), type = "raw" returns a factor
labels <- m %>% cluster_labels()

print(labels)

# Get centers
centers <- m %>% get_centers()

print(centers)