kNN for outlier detection
Ramaswamy et al. proposed the k-nearest neighbors outlier detection method (kNNo). Each point's anomaly score is the distance to its kth nearest neighbor in the data set. All points are then ranked by this score: the higher a point's score, the more anomalous it is.
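The scoring rule described above can be illustrated with a short base-R sketch. This is not the source of do_knno; the helper name knn_outlier_sketch, the use of Euclidean distance via dist(), and the toy data are all assumptions made for illustration only.

```r
# Hypothetical helper (NOT the package's do_knno): a base-R sketch of the
# kth-nearest-neighbor distance score, assuming Euclidean distance.
knn_outlier_sketch <- function(data, k, top_n) {
  # Pairwise Euclidean distances between all rows.
  d <- as.matrix(dist(data))
  # Exclude each point's zero distance to itself.
  diag(d) <- Inf
  # Score of each point: distance to its kth nearest neighbor.
  scores <- apply(d, 1, function(row) sort(row)[k])
  # Row indices of the top_n highest scores, most anomalous first.
  order(scores, decreasing = TRUE)[seq_len(top_n)]
}

# Toy data (assumed): four corners of a unit square plus one distant point.
toy <- rbind(c(0, 0), c(0, 1), c(1, 0), c(1, 1), c(10, 10))
knn_outlier_sketch(toy, k = 2, top_n = 1)
```

On this toy data the distant point in row 5 has by far the largest distance to its second-nearest neighbor, so it is ranked first. Building the full distance matrix is quadratic in the number of observations, so the sketch is only suitable for small data sets.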
do_knno(data, k, top_n)
| data | Data observations. |
| k | Number of neighbors of a point that we are interested in. |
| top_n | Total number of outliers we are interested in. |
Vector of outliers.
Guillermo Vinue
Ramaswamy, S., Rastogi, R. and Shim, K. Efficient Algorithms for Mining Outliers from Large Data Sets. SIGMOD'00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data, 2000, 427-438.
data(mtcars)
data <- as.matrix(mtcars)
outl <- do_knno(data, 3, 2)
outl
data[outl,]