mgm: tvmgm – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

tvmgm

Estimating time-varying Mixed Graphical Models

Description

Estimates time-varying k-order Mixed Graphical Models (MGMs) via elastic-net regularized kernel smoothed Generalized Linear Models

Usage

tvmgm(data, type, level, timepoints, estpoints, bandwidth, ...)

Arguments

`data`	n x p data matrix.
`type`	p vector indicating the type of variable for each column in `data`: "g" for Gaussian, "p" for Poisson, "c" for categorical.
`level`	p vector indicating the number of categories of each variable. For continuous variables set to 1.
`timepoints`	A strictly increasing numeric vector of length `nrow(data)` indicating the time points of the measurements in `data`. If `timepoints` is not specified, it is assumed that the time points are equally spaced. For details, see Haslbeck and Waldorp (2018).
`estpoints`	Vector indicating estimation points on the unit interval [0, 1] (the provided time scale is normalized interally to [0,1]).
`bandwidth`	We use a gaussian density on the unit time-interval [0,1] to determine the weights for each observation at each estimated time point. The bandwidth specifies the standard deviation the Gaussian density. To get some intuition, which bandwidth results in the combination of how many data close in time one can plot Gaussians on [0,1] for different bandwidths. The bandwidth can also be selected in a data driven way using the function (see `bwSelect`).
`...`	Arguments passed to `mgm`, specifying the MGM. See `?mgm`.

Details

Estimates a sequence of MGMs at the time points specified at the locations specified via estpoints. tvmgm() is a wrapper around mgm() and estimates a series of MGM with different weightings which are defined by the estimation locations in estpoints and the bandwidth parameter specified in bandwidth. For details see Haslbeck and Waldorp (2018).

Note that MGMs are not normalizable for all parameter values. See Chen, Witten & Shojaie (2015) for an overview of when pairwise MGMs are normalizable. To our best knowledge, for MGMs with interactions of order > 2 that include non-categorical variables, the conditions for normalizablity are unknown.

Value

A list with the following entries:

`call`	Contains all provided input arguments. If `saveData = TRUE`, it also contains the data.
`pairwise`	Contains a list with all information about estimated pairwise interactions. `wadj` contains a p x p x estpoints array containing the weighted adjacency matrix for each estimation point specified in `estpoints`, if p is the number of variables in the network. `signs` has the same dimensions as `wadj` and contains the signs for the entries of `wadj`: 1 indicates a positive sign, -1 a negative sign and 0 an undefined sign. A sign is undefined if an edge is a function of more than one parameter. This is the case for interactions involving a categorical variable with more than 2 categories. `edgecolor` also has the same dimensions as `wadj` contains a color for each edge, depending on `signs`. It is provided for more convenient plotting. If only pairwise interactions are modeled (`k = 2`), `wadj` contains all conditional independence relations.
`interactions`	Contains a list with one entry for each estimation point specified in `estpoints`; each entry is a list with three entries that relate each interaction in the model to all its parameters. `indicator` contains a list with k-1 entries, one for each order of modeled interaction, which contain the estimated (nonzero) interactions. `weights` contains a list with k-1 entries, which in turn contain R lists, where R is the number of interactions (and rows in the corresponding list entry in`indicator`) that were estimated (nonzero) in the given entry. `signs` has the same structure as `weights` and provides the sign of the interaction, if defined.
`intercepts`	Contains a list with one entry for each estimation point specified in `estpoints`; each entry is a list with p entries, which contain the intercept/thresholds for each node in the network. In case a given node is categorical with m categories, there are m thresholds for this variable (one for each category).
`tvmodels`	Contains the MGM model estimated by `mgm()` at each time point specified via `estpoints`. See `?mgm` for a detailed description of this output.

Author(s)

Jonas Haslbeck <jonashaslbeck@gmail.com>

References

Chen S, Witten DM & Shojaie (2015). Selection and estimation for mixed graphical models. Biometrika, 102(1), 47.

Haslbeck, J. M. B., & Waldorp, L. J. (2020). mgm: Estimating time-varying Mixed Graphical Models in high-dimensional Data. Journal of Statistical Software, 93(8), pp. 1-46. DOI: 10.18637/jss.v093.i08

Yang, E., Baker, Y., Ravikumar, P., Allen, G. I., & Liu, Z. (2014, April). Mixed Graphical Models via Exponential Families. In AISTATS (Vol. 2012, pp. 1042-1050).

Examples

## Not run: 


## We specify a time-varying MGM and recover it using tvmgm()

# 1) Specify Model

# a) Define Graph
p <- 6
type = c("c", "c", "g", "g", "p", "p")
level = c(2, 3, 1, 1, 1, 1)
n_timepoints <- 1000

# b) Define Interaction
factors <- list()
factors[[1]] <- matrix(c(1,2,
                         2,3,
                         3,4), ncol=2, byrow = T)  # no pairwise interactions
factors[[2]] <- matrix(c(1,2,3,
                         2,3,4), ncol=3, byrow = T) # one 3-way interaction

interactions <- list()
interactions[[1]] <- vector("list", length = 3)
interactions[[2]] <- vector("list", length = 2)
# 3 2-way interactions
interactions[[1]][[1]] <- array(0, dim = c(level[1], level[2], n_timepoints))
interactions[[1]][[2]] <- array(0, dim = c(level[2], level[3], n_timepoints))
interactions[[1]][[3]] <- array(0, dim = c(level[3], level[4], n_timepoints))
# 2 3-way interactions
interactions[[2]][[1]] <- array(0, dim = c(level[1], level[2], level[3], n_timepoints))
interactions[[2]][[2]] <- array(0, dim = c(level[2], level[3], level[4], n_timepoints))
theta <- .3
interactions[[1]][[1]][1, 1, ] <- theta
interactions[[1]][[2]][1, 1, ] <- theta
interactions[[1]][[3]][1, 1, ] <- seq(0, theta, length = n_timepoints)
interactions[[2]][[1]][1, 1, 1, ] <- theta
interactions[[2]][[2]][1, 1, 1, ] <- theta
# c) Define Thresholds
thresholds <- list()
thresholds[[1]] <- matrix(0, nrow = n_timepoints, ncol= level[1])
thresholds[[2]] <- matrix(0, nrow = n_timepoints, ncol= level[2])
thresholds[[3]] <- matrix(0, nrow = n_timepoints, ncol= level[3])
thresholds[[4]] <- matrix(0, nrow = n_timepoints, ncol= level[4])
thresholds[[5]] <- matrix(.1, nrow = n_timepoints, ncol= level[5])
thresholds[[6]] <- matrix(.1, nrow = n_timepoints, ncol= level[6])
# d) define sds
sds <- matrix(.2, ncol=p, nrow=n_timepoints)

# 2) Sample Data
set.seed(1)
d_iter <- tvmgmsampler(factors = factors,
                       interactions = interactions,
                       thresholds = thresholds,
                       sds = sds,
                       type = type,
                       level = level,
                       nIter = 100,
                       pbar = TRUE)

data <- d_iter$data
head(data)
# delete inf rows:
ind_finite <- apply(data, 1, function(x) if(all(is.finite(x))) TRUE else FALSE)
table(ind_finite) # all fine for this setup & seed
# in case of inf values (no theory on how to keep k-order MGM well-defined)
data <- data[ind_finite, ] 


# 3) Recover
mgm_c_cv <- tvmgm(data = data,
                  type = type,
                  level = level,
                  k = 3,
                  estpoints = seq(0, 1, length=10),
                  bandwidth = .1,
                  lambdaSel = "CV",
                  ruleReg = "AND",
                  pbar = TRUE,
                  overparameterize = T,
                  signInfo = FALSE)

# Look at time-varying pairwise parameter 3-4
mgm_c_cv$pairwise$wadj[3,4,] # recovers increase

# 4) Predict values / compute nodewise Errors
pred_mgm_cv_w <- predict.mgm(mgm_c_cv,
                             data = data,
                             tvMethod = "weighted")
pred_mgm_cv_cM <- predict.mgm(mgm_c_cv,
                              data = data,
                              tvMethod = "closestModel")

pred_mgm_cv_w$errors
pred_mgm_cv_cM$errors # Pretty similar!


# For more examples see https://github.com/jmbh/mgmDocumentation


## End(Not run)

mgm

Estimating Time-Varying k-Order Mixed Graphical Models

v1.2-11

GPL (>= 2)

Authors

Jonas Haslbeck

Initial release

2021-02-02