
sim3

Synthetic dataset used in section 5.1.3 of the reference paper


Description

Dataset used for testing clustering with a hidden Markov model on variable blocks (HMM-VB). The data dimension is 40. The first 10 dimensions were generated from a 3-component Gaussian Mixture Model (GMM), and the remaining 30 dimensions from a 5-component GMM. The means, covariance matrices and transition probabilities were designed so that the data contain 5 distinct clusters. For details, see the references.

Usage

sim3

Format

A data frame with 1000 rows. The 40 data dimensions are stored as columns, and the last column contains the ground-truth cluster labels.
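
As a minimal sketch of how the layout described above might be used, the snippet below loads the dataset and separates the data dimensions from the label column. It assumes the HDclust package is installed, and it assumes that every column except the last is a data dimension. The names `X`, `labels` and `n_col` are illustrative and not part of the package.

```r
# Assumes HDclust is installed, e.g. via install.packages("HDclust")
library(HDclust)

# Load the dataset into the workspace
data("sim3")

# Assumption: the last column holds the ground-truth cluster labels
# and all remaining columns are data dimensions.
n_col  <- ncol(sim3)
X      <- sim3[, -n_col]   # data dimensions only
labels <- sim3[, n_col]    # ground-truth cluster labels

# Inspect the shape of the data and the size of each labelled cluster
dim(X)
table(labels)
```

Keeping the labels apart from `X` prevents them from being passed to a clustering routine by accident, while leaving them available for comparing an estimated partition against the ground truth.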

References

Lin Lin and Jia Li, "Clustering with hidden Markov model on variable blocks," Journal of Machine Learning Research, 18(110):1-49, 2017.


HDclust

Clustering High Dimensional Data with Hidden Markov Model on Variable Blocks

v1.0.3
GPL (>= 2)
Authors
Yevhen Tupikov [aut], Lin Lin [aut], Lixiang Zhang [aut], Jia Li [aut, cre]
Initial release
2019-04-05
