Estimating Mixed Graphical Models
Function to estimate k-degree Mixed Graphical Models via nodewise regression.
mgm(data, type, level, lambdaSeq, lambdaSel, lambdaFolds, lambdaGam, alphaSeq, alphaSel, alphaFolds, alphaGam, k, moderators, ruleReg, weights, threshold, method, binarySign, scale, verbatim, pbar, warnings, saveModels, saveData, overparameterize, thresholdCat, signInfo, ...)
data |
n x p data matrix |
type |
p vector indicating the type of variable for each column in |
level |
p vector indicating the number of categories of each variable. For continuous variables set to 1. |
lambdaSeq |
A sequence of lambdas that should be searched (see also |
lambdaSel |
Specifies the procedure for selecting the tuning parameter controlling the Lq-penalization. The two options are cross validation "CV" and the Extended Bayesian Information Criterion (EBIC) "EBIC". The EBIC performs well in selecting sparse graphs (see Barber and Drton, 2010 and Foygel and Drton, 2014). Note that when also searching the alpha parameter in the elastic net penalty, cross validation should be preferred, as the parameter vector will not necessarily be sparse anymore. The EBIC tends to be a bit more conservative than CV (see Haslbeck and Waldorp, 2016). CV can sometimes not be performed with categorical variables, because |
lambdaFolds |
Number of folds in cross validation if |
lambdaGam |
Hyperparameter gamma in the EBIC if |
alphaSeq |
A sequence of alpha parameters for the elastic net penality in [0,1] that should be searched (see also |
alphaSel |
Specifies the procedure for selecting the alpha parameter in the elastic net penalty. The two options are cross validation "CV" and the Extended Bayesian Information Criterion (EBIC) "EBIC". The EBIC performs well in selecting sparse graphs (see Barber and Drton, 2010 and Foygel and Drton, 2014). Note that when also searching the alpha parameter in the elastic net penalty, cross validation should be preferred, as the parameter vector will not necessarily be sparse anymore. The EBIC tends to be a bit more conservative than CV (see Haslbeck and Waldorp, 2016). CV can sometimes not be performed with categorical variables, because |
alphaFolds |
Number of folds in cross validation if |
alphaGam |
Hyperparameter gamma in the EBIC if |
k |
Order up until including which interactions are included in the model. |
moderators |
Integer vector with elements in |
ruleReg |
Rule used to combine estimates from neighborhood regression. E.g. for pairwise interactions, two estimates (one from regressing A on B and one from B on A) have to combined in one edge parameter. |
weights |
A n vector with weights for observations. |
threshold |
A threshold below which edge-weights are put to zero. This is done in order to guarantee a lower bound on the false-positive rate. |
method |
Estimation method, currently only |
binarySign |
If |
scale |
If |
verbatim |
If |
pbar |
If |
warnings |
If |
saveModels |
If |
saveData |
If |
overparameterize |
If |
thresholdCat |
If |
signInfo |
If |
... |
Additional arguments. |
mgm()
estimates an exponential mixed graphical model as introduced in Yang and colleagies (2014). Note that MGMs are not normalizable for all parameter values. See Chen, Witten & Shojaie (2015) for an overview of when pairwise MGMs are normalizable. To our best knowledge, for MGMs with interactions of order > 2 that include non-categorical variables, the conditions for normalizablity are unknown.
The function returns a list with the following entries:
call |
Contains all provided input arguments. If |
pairwise |
Contains a list with all information about estimated pairwise interactions. |
interactions |
A list with three entries that relate each interaction in the model to all its parameters. This is different to the output provided in |
intercepts |
A list with p entries, which contain the intercept/thresholds for each node in the network. In case a given node is categorical with m categories, there are m thresholds for this variable. |
nodemodels |
A list with p |
Jonas Haslbeck <jonashaslbeck@gmail.com>
Barber, R. F., & Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. Electronic Journal of Statistics, 9(1), 567-607.
Chen S, Witten DM & Shojaie (2015). Selection and estimation for mixed graphical models. Biometrika, 102(1), 47.
Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in neural information processing systems (pp. 604-612).
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1), 1.
Haslbeck, J. M. B., & Waldorp, L. J. (2020). mgm: Estimating time-varying Mixed Graphical Models in high-dimensional Data. Journal of Statistical Software, 93(8), pp. 1-46. DOI: 10.18637/jss.v093.i08
Loh, P. L., & Wainwright, M. J. (2012, December). Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses. In NIPS (pp. 2096-2104).
Yang, E., Baker, Y., Ravikumar, P., Allen, G. I., & Liu, Z. (2014, April). Mixed Graphical Models via Exponential Families. In AISTATS (Vol. 2012, pp. 1042-1050).
## Not run: ## We fit a pairwise and 3-order MGM to the mixed Autism dataset (?autism_data) # 1) Fit Pairwise MGM # Call mgm() fit_k2 <- mgm(data = autism_data$data, type = autism_data$type, level = autism_data$lev, k = 2) # ad most pairwise interacitons # Weighted adjacency matrix fit_k2$pairwise$wadj # Visualize using qgraph() library(qgraph) qgraph(fit_k2$pairwise$wadj, edge.color = fit_k2$pairwise$edgecolor, layout = "spring", labels = autism_data$colnames) # 2) Fit MGM with pairwise & three-way interactions fit_k3 <- mgm(data = autism_data$data, type = autism_data$type, level = autism_data$lev, k = 3) # include all interactions up to including order 3 # List of estimated interactions fit_k3$interactions$indicator # 3) Plot Factor Graph FactorGraph(object = fit_k3, PairwiseAsEdge = FALSE, labels = autism_data$colnames) # 4) Predict values pred_obj <- predict(fit_k3, autism_data$data) head(pred_obj$predicted) # first six rows of predicted values pred_obj$errors # Nodewise errors ## Here we illustrate why we need to overparameterize the design matrix to ## recover higher order interactions including categorical variables # 1) Define Graph (one 3-way interaction between 3 binary variables) # a) General Graph Info type = c("c", "c", "c") level = c(2, 2, 2) # b) Define Interaction factors <- list() factors[[1]] <- NULL # no pairwise interactions factors[[2]] <- matrix(c(1,2,3), ncol=3, byrow = T) # one 3-way interaction interactions <- list() interactions[[1]] <- NULL interactions[[2]] <- vector("list", length = 1) # threeway interaction no1 interactions[[2]][[1]] <- array(0, dim = c(level[1], level[2], level[3])) theta <- .7 interactions[[2]][[1]][1, 1, 1] <- theta #weight theta for conf (1,1,1), weight 0 for all others # c) Define Thresholds thresholds <- list() thresholds[[1]] <- c(0, 0) thresholds[[2]] <- c(0, 0) thresholds[[3]] <- c(0, 0) # 2) Sample from Graph iter <- 1 set.seed(iter) N <- 2000 d_iter <- mgmsampler(factors = factors, interactions = interactions, thresholds = thresholds, type = type, level = level, N = N, nIter = 50, pbar = TRUE) # 3.1) Estimate order 3 MGM usind standard parameterization d_est_stand <- mgm(data = d_iter$data, type = type, level = level, k = 3, lambdaSel = "CV", ruleReg = "AND", pbar = TRUE, overparameterize = FALSE, signInfo = FALSE) # We look at the nodewise regression for node 1 (same for all) coefs_stand <- d_est_stand$nodemodels[[1]]$model coefs_stand # We see that nonzero-zero pattern of parameter vector does not allow us to infer whether # interactions are present or not # 3.2) Estimate order 3 MGM usind overparameterization d_est_over <- mgm(data = d_iter$data, type = type, level = level, k = 3, lambdaSel = "CV", ruleReg = "AND", pbar = TRUE, overparameterize = TRUE, signInfo = FALSE) # We look at the nodewise regression for node 1 (same for all) coefs_over <- d_est_over$nodemodels[[1]]$model coefs_over # recovers exactly the 3-way interaction # For more examples see https://github.com/jmbh/mgmDocumentation ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.