micemd: find.defaultMethod – R documentation


find.defaultMethod

Suggestion of conditional imputation models to use according to the incomplete dataset

Description

Provides conditional imputation models to use for each column of the incomplete dataset according to the number of clusters, the number of individuals per cluster and the class of the variables.

Usage

find.defaultMethod(don.na, ind.clust, I.small = 7, ni.small = 100, prop.small = 0.4)

Arguments

`don.na`	An incomplete data frame.
`ind.clust`	A scalar giving the index of the variable that holds the cluster indicator.
`I.small`	A scalar that is used as threshold to consider the number of observed clusters (fully observed or partially observed) as small. Default is `I.small=7`.
`ni.small`	A scalar that is used as threshold to consider the number of individuals per cluster (with observed values) as small. Default is `ni.small=100`.
`prop.small`	A scalar that is used as threshold to consider the proportion of small clusters as small. Default is `prop.small=0.4`.
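
The three thresholds are easier to read next to the quantities they are compared with. The following is a minimal base R sketch on a made-up data frame; `toy`, `n.obs`, `n.clusters` and `prop.small.clusters` are hypothetical names, and the way observed values are counted here is an assumption, not the package's internal code.

```r
# Toy incomplete data set; column 1 is the cluster indicator (ind.clust = 1).
set.seed(1)
toy <- data.frame(cluster = rep(1:5, times = c(150, 120, 30, 200, 140)),
                  y = rnorm(640))
toy$y[seq(1, 640, by = 9)] <- NA   # every 9th value of y is missing

ind.clust <- 1
observed <- !is.na(toy$y)

# Number of observed values of y in each cluster that has at least one
n.obs <- table(toy[observed, ind.clust])

# Quantity compared with I.small: number of observed clusters
n.clusters <- length(n.obs)

# Quantity compared with prop.small: share of clusters whose number of
# observed values falls below ni.small (here the default, 100)
prop.small.clusters <- mean(as.vector(n.obs) < 100)

print(n.obs)
print(n.clusters)
print(prop.small.clusters)
```

Under the default thresholds, this toy variable would count as having few observed clusters and many observed values per cluster.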

Details

Provides conditional imputation models to use for each column of the incomplete dataset according to the number of clusters, the number of individuals per cluster and the class of the variable (Audigier, V. et al. 2017). The returned methods can be: 2l.stage.bin (binary), 2l.stage.norm (continuous), 2l.stage.pois (integer), 2l.glm.bin (binary), 2l.glm.norm (continuous), 2l.glm.pois (integer), 2l.jomo (continuous or binary). For a given variable, the method is chosen according to the following decision tree:

Few observed clusters

              Few observed values    Many observed values
              per cluster            per cluster
------------  ---------------------  ---------------------
continuous    2l.glm.norm            2l.stage.norm
binary        2l.glm.bin             2l.stage.bin
integer       2l.glm.pois            2l.stage.pois
------------  ---------------------  ---------------------

Many observed clusters

              Few observed values    Many observed values
              per cluster            per cluster
------------  ---------------------  ---------------------
continuous    2l.glm.norm            2l.stage.norm
binary        2l.jomo                2l.jomo
integer       2l.glm.pois            2l.stage.pois
------------  ---------------------  ---------------------

For instance, with few observed clusters (i.e. fewer than I.small) and many observed values per cluster (i.e. the proportion of clusters with fewer than ni.small observed values is below prop.small), the method 2l.stage.norm will be suggested for imputing a continuous variable.
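
The decision tree can be written as a small lookup function. This is a sketch in base R and not the code used by micemd: `suggest_method` and its arguments `var.class` and `n.obs.per.cluster` are hypothetical, and treating values exactly equal to a threshold as "not small" is an assumption.

```r
# Hypothetical helper, not part of micemd: encodes the two tables above.
suggest_method <- function(var.class, n.obs.per.cluster,
                           I.small = 7, ni.small = 100, prop.small = 0.4) {
  # var.class: one of "continuous", "binary", "integer"
  # n.obs.per.cluster: number of observed values in each observed cluster

  # Few observed clusters: fewer than I.small clusters
  few.clusters <- length(n.obs.per.cluster) < I.small

  # Few observed values per cluster: the share of clusters with fewer than
  # ni.small observed values is not below prop.small (boundary: assumption)
  few.values <- mean(n.obs.per.cluster < ni.small) >= prop.small

  # Binary variables with many observed clusters always get 2l.jomo
  if (var.class == "binary" && !few.clusters) {
    return("2l.jomo")
  }

  suffix <- c(continuous = "norm", binary = "bin", integer = "pois")[[var.class]]
  prefix <- if (few.values) "2l.glm." else "2l.stage."
  paste0(prefix, suffix)
}

# Few clusters, many observed values per cluster, continuous variable
print(suggest_method("continuous", c(150, 200, 180, 120)))

# Many clusters, binary variable
print(suggest_method("binary", rep(50, 20)))

# Few clusters, few observed values per cluster, integer variable
print(suggest_method("integer", c(20, 30, 500)))
```

The first call reproduces the case described in the preceding paragraph. In the tables, the number of observed clusters only changes the suggestion for binary variables, which is why the sketch tests it in a single place.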

Value

A character vector of length ncol(don.na), giving the suggested imputation method for each column.

Author(s)

Vincent Audigier vincent.audigier@cnam.fr

References

Audigier, V., White, I., Jolani, S., Debray, T., Quartagno, M., Carpenter, J., van Buuren, S. and Resche-Rigon, M. (2018). Multiple imputation for multilevel data with continuous and binary variables. Statistical Science. <doi:10.1214/18-STS646>

Jolani, S., Debray, T. P. A., Koffijberg, H., van Buuren, S., and Moons, K. G. M. (2015). Imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE. Statistics in Medicine, 34(11):1841-1863. <doi:10.1002/sim.6451>

Quartagno, M. and Carpenter, J. R. (2016). Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates. Statistics in Medicine, 35(17):2938-2954. <doi:10.1002/sim.6837>

Resche-Rigon, M. and White, I. R. (2016). Multiple imputation by chained equations for systematically and sporadically missing multilevel data. Statistical Methods in Medical Research. To appear. <doi:10.1177/0962280216666564>

Examples

library(mice)    # provides mice()
library(micemd)  # provides find.defaultMethod(), mice.par() and CHEM97Na

data(CHEM97Na)

ind.clust <- 1  # index of the cluster variable

# initialisation of the argument predictorMatrix
predictor.matrix <- mice(CHEM97Na, m = 1, maxit = 0)$pred
predictor.matrix[ind.clust, ind.clust] <- 0
predictor.matrix[-ind.clust, ind.clust] <- -2  # -2 flags the cluster variable
predictor.matrix[predictor.matrix == 1] <- 2   # 2 flags the other predictors

# initialisation of the argument method
method <- find.defaultMethod(CHEM97Na, ind.clust)
print(method)

# multiple imputation by chained equations (parallel calculation)
# res.mice <- mice.par(CHEM97Na, m = 3, predictorMatrix = predictor.matrix, method = method)

micemd

Multiple Imputation by Chained Equations with Multilevel Data

v1.6.0

GPL-2 | GPL-3

Authors

Vincent Audigier [aut, cre] (CNAM MSDMA team), Matthieu Resche-Rigon [aut] (INSERM ECSTRA team)

Initial release

2019-07-09