The p-value computation for the test of independence using a fixed partition size
The p-value computation for the distribution-free test of independence between two univariate random variables of Heller et al. (2016), using a fixed partition size m.
hhg.univariate.ind.pvalue(statistic, NullTable, m = min(statistic$mmax, 4), l = m)
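statistic | The value of the statistic computed by the function hhg.univariate.ind.stat.
NullTable | The null table of the statistic, which can be downloaded from the software website (http://www.math.tau.ac.il/~ruheller/Software.html) or computed by the function hhg.univariate.ind.nulltable.
m | The partition size.
l | For the 'ADP-ML' and 'ADP-EQP-ML' variants, the second partition dimension: partitions are of size m X l.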
For the test statistic, the function extracts the fraction of observations in the null table that are at least as large as the test statistic, i.e. the p-value.
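The fraction described above can be illustrated with a minimal base-R sketch. It is not the package's internal code: `empirical_pvalue`, `observed` and `null_stats` are hypothetical names, and the null sample is a made-up toy vector rather than a real HHG null table.

```r
# Empirical p-value: the share of null statistics that are at least as large
# as the observed statistic. All names here are illustrative assumptions,
# not objects exported by the HHG package.
empirical_pvalue <- function(observed, null_stats) {
  stopifnot(is.numeric(observed), length(observed) == 1L,
            is.numeric(null_stats), length(null_stats) > 0L)
  mean(null_stats >= observed)
}

# Toy null sample (illustrative values only)
null_stats <- c(0.8, 1.1, 1.5, 2.0, 2.4, 3.1, 3.3, 4.0, 4.7, 5.2)

# Four of the ten null values are >= 3.3
p <- empirical_pvalue(3.3, null_stats)
print(p)
```

With a simulated null table, the resolution of such a p-value is limited by the number of replicates: with 1000 replicates, no value strictly between 0 and 0.001 can be returned.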
For the 'DDP', 'ADP' and 'ADP-EQP' variants, the partition size is described by a single parameter m (since the partition size is m X m). For the 'ADP-ML' and 'ADP-EQP-ML' variants, partitions of the data are of size m X l, allowing for asymmetric tables.
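The way m and l describe the partition for each variant can be summarised in a short sketch. The helper `partition_dims` is hypothetical and only restates the rule above; it is not part of the HHG package, and it assumes that l is ignored for the variants with m X m partitions.

```r
# Hypothetical helper: returns the partition dimensions implied by the
# variant name. Not an HHG function; it only mirrors the rule stated above.
partition_dims <- function(variant, m, l = m) {
  square <- c("DDP", "ADP", "ADP-EQP")          # partitions are m X m
  rectangular <- c("ADP-ML", "ADP-EQP-ML")      # partitions are m X l
  if (variant %in% square) {
    c(m = m, l = m)   # assumption: l plays no role for these variants
  } else if (variant %in% rectangular) {
    c(m = m, l = l)
  } else {
    stop("unknown variant: ", variant)
  }
}

dims_square <- partition_dims("ADP-EQP", m = 5, l = 3)
dims_rect <- partition_dims("ADP-EQP-ML", m = 5, l = 3)
print(dims_square)
print(dims_rect)
```

Defaulting `l` to `m` mirrors the `l = m` default in the usage line, so a call that omits `l` describes a symmetric table.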
The p-value.
Barak Brill and Shachar Kaufman.
Heller, R., Heller, Y., Kaufman, S., Brill, B., & Gorfine, M. (2016). Consistent Distribution-Free K-Sample and Independence Tests for Univariate Random Variables. JMLR 17(29):1-54.
Brill, B. (2016). Scalable Non-Parametric Tests of Independence (master's thesis).
http://primage.tau.ac.il/libraries/theses/exeng/free/2899741.pdf
## Not run:

N = 35
data = hhg.example.datagen(N, 'Parabola')
X = data[1,]
Y = data[2,]
plot(X,Y)

#I) Computing test statistics, with default parameters:

#statistic:
hhg.univariate.ADP.Likelihood.result = hhg.univariate.ind.stat(X,Y)
hhg.univariate.ADP.Likelihood.result

#null table:
ADP.null = hhg.univariate.ind.nulltable(N)

#pvalue:
hhg.univariate.ind.pvalue(hhg.univariate.ADP.Likelihood.result, ADP.null)

#II) Computing test statistics, with summation over Data Derived Partitions (DDP),
#using Pearson scores, and partition sizes up to 5:

#statistic:
hhg.univariate.DDP.Pearson.result = hhg.univariate.ind.stat(X,Y,variant = 'DDP',
  score.type = 'Pearson', mmax = 5)
hhg.univariate.DDP.Pearson.result

#null table:
DDP.null = hhg.univariate.ind.nulltable(N,mmax = 5,variant = 'DDP',
  score.type = 'Pearson', nr.replicates = 1000)

#pvalue, for different partition sizes:
hhg.univariate.ind.pvalue(hhg.univariate.DDP.Pearson.result, DDP.null, m = 2)
hhg.univariate.ind.pvalue(hhg.univariate.DDP.Pearson.result, DDP.null, m = 5)

#III) Computing p-values for the variants used for large N:

N_Large = 1000
data_Large = hhg.example.datagen(N_Large, 'W')
X_Large = data_Large[1,]
Y_Large = data_Large[2,]
plot(X_Large,Y_Large)

NullTable_ADP_EQP = hhg.univariate.ind.nulltable(N_Large, variant = 'ADP-EQP',
  nr.atoms = 30, nr.replicates = 200)
NullTable_ADP_EQP_ML = hhg.univariate.ind.nulltable(N_Large, variant = 'ADP-EQP-ML',
  nr.atoms = 30, nr.replicates = 200)

ADP_EQP_result = hhg.univariate.ind.stat(X_Large,Y_Large,variant = 'ADP-EQP',
  nr.atoms = 30)
ADP_EQP_ML_result = hhg.univariate.ind.stat(X_Large,Y_Large,variant = 'ADP-EQP-ML',
  nr.atoms = 30)

#P-value for the S_(5X5) statistic, the sum over all 5X5 partitions:
hhg.univariate.ind.pvalue(ADP_EQP_result, NullTable_ADP_EQP, m = 5)

#P-value for the S_(5X3) statistic, the sum over all 5X3 partitions:
hhg.univariate.ind.pvalue(ADP_EQP_ML_result, NullTable_ADP_EQP_ML, m = 5, l = 3)

## End(Not run)