Ridge estimation for high-dimensional precision matrices with known chordal support
Calculates various ridge estimators for high-dimensional precision matrices with known support. This support should form a chordal graph. If the provided support is not chordal, the function makes it so.
ridgePchordal(S, lambda, zeros, cliques = list(), separators = list(),
              target = default.target(S), type = "Alt", optimizer = "nlm",
              grad = FALSE, verbose = TRUE, ...)
S: Sample covariance matrix.
lambda: A positive numeric representing the value of the penalty parameter.
zeros: A matrix with two columns, holding the row and column indices of the entries of the precision matrix that are constrained to zero. The implied graph should be chordal; if it is not, it is triangulated.
cliques: A list of integer vectors, each representing a clique of the chordal support graph.
separators: A list of integer vectors, each representing a separator of the chordal support graph.
target: A target matrix (in precision terms) for the ridge estimator.
type: A character indicating the type of ridge estimator: "Alt", "ArchI", or "ArchII".
optimizer: A character indicating which optimization routine, "nlm" or "optim", is employed when type="Alt".
grad: A logical indicating whether the gradient at the initial and the optimal value should be returned.
verbose: A logical indicating whether progress should be printed on screen.
...: Additional arguments passed on to the chosen optimizer.
Sister function to the ridgeP function, incorporating a known chordal zero structure of the precision matrix.
The loss function for type="ArchII" is:
\log(| \mathbf{\Omega} |) - \mbox{tr}( \mathbf{S} \mathbf{\Omega} ) + \lambda \big\{ \log(| \mathbf{\Omega} |) - \mbox{tr}[ (\mathbf{S} + (1+\lambda) \mathbf{I}_{p \times p}) \mathbf{\Omega} ] \big\}.
For type="ArchI" it is:
(1-\lambda) \big[ \log(| \mathbf{\Omega} |) - \mbox{tr}( \mathbf{S} \mathbf{\Omega} ) \big] + \lambda \big[ \log(| \mathbf{\Omega} |) - \mbox{tr}( \mathbf{\Omega} ) \big],
which is obtained from:
\log(| \mathbf{\Omega} |) - \mbox{tr}( \mathbf{S} \mathbf{\Omega} ) + \nu \big[ \log(| \mathbf{\Omega} |) - \mbox{tr}( \mathbf{\Omega} ) \big]
by division by (1+\nu) and writing \lambda = \nu / (1 + \nu).
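This reparameterization can be checked numerically. The following NumPy sketch (illustrative only, not package code; the names fit, loss_nu, and loss_archI are ad hoc) evaluates both parameterizations at an arbitrary positive-definite precision matrix and confirms they agree up to the factor (1+ν):

```python
import numpy as np

# Sanity check of lambda = nu / (1 + nu): dividing the nu-parameterized
# loss by (1 + nu) must reproduce the ArchI loss. All names are ad hoc.
rng = np.random.default_rng(1)
p = 4
A = rng.standard_normal((50, p))
S = A.T @ A / 50                              # sample covariance matrix

Omega = np.eye(p) + 0.1 * np.ones((p, p))     # arbitrary PD precision matrix

def fit(O, T):
    """Log-likelihood-type term: log|O| - tr(T O)."""
    return np.linalg.slogdet(O)[1] - np.trace(T @ O)

nu = 2.0
lam = nu / (1 + nu)
loss_nu = fit(Omega, S) + nu * fit(Omega, np.eye(p))                  # nu form
loss_archI = (1 - lam) * fit(Omega, S) + lam * fit(Omega, np.eye(p))  # ArchI form

assert np.isclose(loss_nu / (1 + nu), loss_archI)
```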
An explicit expression for the minimizer of the loss functions implied by the
archetypal ridge estimators (type="ArchI" and type="ArchII")
exists. For the simple case in which the graph decomposes into cliques
\mathcal{C}_1, \mathcal{C}_2 and separator \mathcal{S} the
estimator is:
\widehat{\mathbf{\Omega}} = \left( \begin{array}{lll} \, [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_1})}]_{\mathcal{C}_1 \setminus \mathcal{S}, \mathcal{C}_1 \setminus \mathcal{S}} & [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_1})}]_{\mathcal{C}_1 \setminus \mathcal{S}, \mathcal{S}} & \mathbf{0}_{|\mathcal{C}_1 \setminus \mathcal{S}| \times |\mathcal{C}_2 \setminus \mathcal{S}|} \\ \, [\widehat{\mathbf{\Omega}}^{(\mathcal{C}_1)}]_{\mathcal{S}, \mathcal{C}_1 \setminus \mathcal{S}} & [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_1})}]_{\mathcal{S}, \mathcal{S}} + [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_2})}]_{\mathcal{S}, \mathcal{S}} - \widehat{\mathbf{\Omega}}^{(\mathcal{S})} & [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_2})}]_{\mathcal{S}, \mathcal{C}_2 \setminus \mathcal{S}} \\ \, \mathbf{0}_{|\mathcal{C}_2 \setminus \mathcal{S}| \times |\mathcal{C}_1 \setminus \mathcal{S}|} & [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_2})}]_{\mathcal{C}_2 \setminus \mathcal{S}, \mathcal{S}} & [\widehat{\mathbf{\Omega}}^{({\mathcal{C}_2})}]_{\mathcal{C}_2 \setminus \mathcal{S}, \mathcal{C}_2 \setminus \mathcal{S}} \end{array} \right),
where \widehat{\mathbf{\Omega}}^{({\mathcal{C}_1})}, \widehat{\mathbf{\Omega}}^{({\mathcal{C}_2})} and \widehat{\mathbf{\Omega}}^{({\mathcal{S}})} are the marginal ridge ML precision estimators for the cliques \mathcal{C}_1, \mathcal{C}_2 and the separator \mathcal{S}. The general form of the estimator, implemented here, is analogous to that provided in Proposition 5.9 of Lauritzen (2004). The proof that this estimator indeed optimizes the corresponding loss function is fully analogous to that of Proposition 5.6 of Lauritzen (2004).
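For the boundary case λ = 0, where the marginal ridge estimators reduce to plain inverses of the corresponding sample covariance blocks, this assembly can be verified numerically. The following NumPy sketch (illustrative only, not the package implementation) zero-pads the marginal inverses into the full p × p frame exactly as in the display above and checks the two defining properties of the decomposable estimator:

```python
import numpy as np

# Clique/separator assembly at lambda = 0 for cliques C1 = {1,2,3},
# C2 = {3,4,5} with separator {3} (0-based indices below).
rng = np.random.default_rng(333)
p = 5
A = rng.standard_normal((100, p))
S = A.T @ A / 100                      # sample covariance (PD w.h.p.)

C1, C2, Sep = [0, 1, 2], [2, 3, 4], [2]

def pad(block, idx, p):
    """Zero-pad a marginal precision estimate into the full p x p frame."""
    K = np.zeros((p, p))
    K[np.ix_(idx, idx)] = block
    return K

K_hat = (pad(np.linalg.inv(S[np.ix_(C1, C1)]), C1, p)
         + pad(np.linalg.inv(S[np.ix_(C2, C2)]), C2, p)
         - pad(np.linalg.inv(S[np.ix_(Sep, Sep)]), Sep, p))

# entries outside the cliques are exactly zero ...
assert K_hat[0, 4] == 0 and K_hat[1, 3] == 0
# ... and the implied covariance matches S on each clique marginal
Sigma_hat = np.linalg.inv(K_hat)
assert np.allclose(Sigma_hat[np.ix_(C1, C1)], S[np.ix_(C1, C1)])
assert np.allclose(Sigma_hat[np.ix_(C2, C2)], S[np.ix_(C2, C2)])
```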
For type="Alt" no explicit expression of the maximizer of the
ridge penalized log-likelihood exists. However, an initial estimator
analogous to that for type="ArchI" and type="ArchII" can be
defined. In various boundary cases (\lambda=0, \lambda=\infty,
and \mathcal{S} = \emptyset) this initial estimator actually optimizes
the loss function. In general, however, it does not. Nevertheless, it
serves as a well-educated guess for any Newton-like optimization method:
convergence is usually achieved quickly. The Newton-like procedure optimizes
an unconstrained problem equivalent to that of the penalized log-likelihood
with known zeros for the precision matrix (see Dahl et al., 2005 for
details).
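A minimal sketch of such a scheme, assuming the type="Alt" loss takes the quadratic-penalty form \log(|\mathbf{\Omega}|) - \mbox{tr}(\mathbf{S}\mathbf{\Omega}) - (\lambda/2) \|\mathbf{\Omega} - \mathbf{T}\|_F^2 (the support, names, and optimizer below are illustrative assumptions, not the package internals). As in Dahl et al. (2005), the zero constraints are removed by optimizing over the free support entries only:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative only: maximize f(Omega) = log|Omega| - tr(S Omega)
#                                        - (lambda/2)||Omega - T||_F^2
# over the free (support) entries, so the zero-constrained problem
# becomes an unconstrained one.
rng = np.random.default_rng(0)
p = 4
A = rng.standard_normal((60, p))
S = A.T @ A / 60                        # sample covariance
lam, T = 0.5, np.eye(p)                 # penalty and target (assumed)

support = np.ones((p, p), dtype=bool)   # chordal support: drop edge (1,4)
support[0, 3] = support[3, 0] = False
free = [(i, j) for i in range(p) for j in range(i, p) if support[i, j]]

def unpack(x):                          # free entries -> symmetric matrix
    O = np.zeros((p, p))
    for v, (i, j) in zip(x, free):
        O[i, j] = O[j, i] = v
    return O

def neg_loss_and_grad(x):
    O = unpack(x)
    sign, ld = np.linalg.slogdet(O)
    if sign <= 0:                       # outside the PD cone
        return 1e10, np.zeros(len(x))
    G = np.linalg.inv(O) - S - lam * (O - T)   # gradient of f at O
    g = np.array([G[i, j] * (1 if i == j else 2) for (i, j) in free])
    f = ld - np.trace(S @ O) - 0.5 * lam * np.sum((O - T) ** 2)
    return -f, -g

x0 = np.array([float(i == j) for (i, j) in free])    # initial guess: I
res = minimize(neg_loss_and_grad, x0, jac=True, method="BFGS")
Ohat = unpack(res.x)

# at the optimum the gradient, restricted to the support, vanishes ...
G = np.linalg.inv(Ohat) - S - lam * (Ohat - T)
assert np.max(np.abs(G[support])) < 1e-4
# ... while the banned entry stays exactly zero
assert Ohat[0, 3] == 0
```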
If grad=FALSE, the function returns a regularized precision
matrix with the specified chordal sparsity structure.
If grad=TRUE, a list is returned comprising i) the
estimated precision matrix, and ii) the gradients at the initial and
at the optimal (if reached) value. The gradient is returned so that one
can check whether it is indeed (close to) zero at the optimum.
Wessel N. van Wieringen.
Dahl, J., Roychowdhury, V., Vandenberghe, L. (2005), "Maximum likelihood estimation of Gaussian graphical models: numerical implementation and topology selection", Technical report, UCLA.
Lauritzen, S.L. (2004). Graphical Models. Oxford University Press.
Miok, V., Wilting, S.M., Van Wieringen, W.N. (2016), "Ridge estimation of the VAR(1) model and its time series chain graph from multivariate time-course omics data", Biometrical Journal, 59(1), 172-191.
# obtain some (high-dimensional) data
p <- 8
n <- 100
set.seed(333)
Y <- matrix(rnorm(n*p), nrow = n, ncol = p)

# define zero structure
S <- covML(Y)
S[1:3, 6:8] <- 0
S[6:8, 1:3] <- 0
zeros <- which(S == 0, arr.ind = TRUE)

# obtain (triangulated) support info
supportP <- support4ridgeP(nNodes = p, zeros = zeros)

# estimate precision matrix with known (triangulated) support
Phat <- ridgePchordal(S, 0.1, zeros = supportP$zeros,
                      cliques = supportP$cliques,
                      separators = supportP$separators)