Data generator for intrinsic dimension estimation.
gendata generates various artificial datasets for intrinsic dimension estimation experiments.
gendata(DataName = "SwissRoll", n = 300, p = NULL, noise = NULL, ol = NULL, curv = 1, seed = 123, sorted = FALSE)
DataName | Name of dataset, one of the following:
n | number of data points to be generated.
p | ambient dimension of the dataset.
noise | parameter to control noise level in the dataset. In many cases, it is used for
ol | percentage of outliers, i.e., n * ol outliers are added to the generated dataset.
curv | a parameter to control the complexity of the embedded manifold.
seed | random number seed.
sorted | logical. If
This function generates various artificial datasets often used in manifold learning and dimension estimation research.
For some datasets, the complexity of the shape is controlled by the parameter curv.
The parameters noise and ol are used to add noise and/or outliers to the dataset.
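To make the roles of n, noise, and ol concrete, here is a minimal base-R sketch of a noisy Swiss roll with appended outliers. It is not the package's implementation: make_swiss_roll is a hypothetical helper, and it assumes that noise acts as the standard deviation of isotropic Gaussian noise and that outliers are drawn uniformly from the bounding box of the data. The only behaviour taken from the description above is that n * ol outliers are added to the n generated points.

```r
# Hypothetical helper (not part of the package): a base-R sketch of a
# Swiss roll in 3 dimensions with optional noise and outliers.
make_swiss_roll <- function(n = 300, noise = 0, ol = 0, seed = 123) {
  set.seed(seed)
  # Intrinsic coordinates: angle t along the spiral and height h.
  t <- 1.5 * pi * (1 + 2 * runif(n))
  h <- 21 * runif(n)
  x <- cbind(t * cos(t), h, t * sin(t))
  # Assumption: noise is the standard deviation of Gaussian noise
  # added independently to every coordinate.
  if (noise > 0) {
    x <- x + matrix(rnorm(n * 3, sd = noise), nrow = n, ncol = 3)
  }
  # Assumption: outliers are uniform draws from the bounding box of the
  # data; round(n * ol) of them are appended as extra rows.
  n_out <- round(n * ol)
  if (n_out > 0) {
    lo <- apply(x, 2, min)
    hi <- apply(x, 2, max)
    out <- sapply(1:3, function(j) runif(n_out, lo[j], hi[j]))
    x <- rbind(x, matrix(out, nrow = n_out, ncol = 3))
  }
  x
}

x <- make_swiss_roll(n = 300, noise = 0.05, ol = 0.1)
dim(x)  # 330 rows (300 points plus 30 outliers), 3 columns
```

In this sketch the intrinsic dimension of the clean data is 2 (the pair t, h) while the ambient dimension is 3, which is the kind of gap that intrinsic dimension estimators are meant to recover.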
Data matrix. For the ldbl dataset, it outputs a list composed of x: data matrix and tDim: true intrinsic dimension for each point.
Hideitsu Hino hideitsu.hino@gmail.com
## global intrinsic dimension estimate
x <- gendata(DataName='SwissRoll')
estmle <- lbmle(x=x,k1=3,k2=5)
print(estmle)

## local intrinsic dimension estimate
tmp <- gendata(DataName='ldbl',n=1000)
x <- tmp$x
estmada <- mada(x=x,local=TRUE)
head(estmada)  ## estimated local intrinsic dimensions
head(tmp$tDim) ## true local intrinsic dimensions