N-way Canonical Polyadic Decomposition
Fits Frank L. Hitchcock's Canonical Polyadic Decomposition (CPD) to N-way data arrays. Parameters are estimated via alternating least squares.
cpd(X, nfac, nstart = 10, maxit = 500, ctol = 1e-4, parallel = FALSE, cl = NULL, output = "best", verbose = TRUE)
X |
N-way data array. Missing data are allowed (see Note). |
nfac |
Number of factors. |
nstart |
Number of random starts. |
maxit |
Maximum number of iterations. |
ctol |
Convergence tolerance (R^2 change). |
parallel |
Logical indicating if |
cl |
Cluster created by |
output |
Output the best solution (default) or output all |
verbose |
If |
X[i.1, ..., i.N] = sum A.1[i.1,r] * ... * A.N[i.N,r] + E[i.1, ..., i.N]
|
where A.n
are the n-th mode's weights for n = 1, ..., N, and E
is the N-way residual array. The summation is for r = seq(1,R)
.
A |
List of length N containing the weights for each mode. |
SSE |
Sum of Squared Errors. |
Rsq |
R-squared value. |
GCV |
Generalized Cross-Validation. |
edf |
Effective degrees of freedom. |
iter |
Number of iterations. |
cflag |
Convergence flag. See Note. |
The algorithm can perform poorly if the number of factors nfac
is set too large.
Missing data should be specified as NA
values in the input X
. The missing data are randomly initialized and then iteratively imputed as a part of the algorithm.
Output cflag
gives convergence information: cflag = 0
if algorithm converged normally and cflag = 1
if maximum iteration limit was reached before convergence.
Nathaniel E. Helwig <helwig@umn.edu>
Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.
Harshman, R. A., & Lundy, M. E. (1994). PARAFAC: Parallel factor analysis. Computational Statistics and Data Analysis, 18, 39-72.
Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6, 164-189.
The parafac
function provides a more flexible implemention for 3-way and 4-way arrays.
The fitted.cpd
function creates the model-implied fitted values from a fit "cpd"
object.
The resign.cpd
function can be used to resign factors from a fit "cpd"
object.
The rescale.cpd
function can be used to rescale factors from a fit "cpd"
object.
The reorder.cpd
function can be used to reorder factors from a fit "cpd"
object.
########## 3-way example ########## # create random data array with CPD/Parafac structure set.seed(3) mydim <- c(50, 20, 5) nf <- 3 Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf) Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf) Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf) Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat)) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano # fit Parafac model set.seed(0) pfac <- parafac(X, nfac = nf, nstart = 1) pfac ########## 4-way example ########## # create random data array with CPD/Parafac structure set.seed(4) mydim <- c(30,10,8,10) nf <- 4 aseq <- seq(-3, 3, length.out = mydim[1]) Amat <- cbind(dnorm(aseq), dchisq(aseq+3.1, df=3), dt(aseq-2, df=4), dgamma(aseq+3.1, shape=3, rate=1)) Bmat <- svd(matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf), nv = 0)$u Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf) Cstruc <- Cmat > 0.5 Cmat <- Cmat * Cstruc Dmat <- matrix(runif(mydim[4]*nf), nrow = mydim[4], ncol = nf) Xmat <- tcrossprod(Amat, krprod(Dmat, krprod(Cmat, Bmat))) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano # fit Parafac model set.seed(0) pfac <- parafac(X, nfac = nf, nstart = 1) pfac ########## 5-way example ########## # create random data array with CPD/Parafac structure set.seed(5) mydim <- c(5, 6, 7, 8, 9) nmode <- length(mydim) nf <- 3 Amat <- vector("list", nmode) for(n in 1:nmode) { Amat[[n]] <- matrix(rnorm(mydim[n] * nf), mydim[n], nf) } Zmat <- krprod(Amat[[3]], Amat[[2]]) for(n in 4:5) Zmat <- krprod(Amat[[n]], Zmat) Xmat <- tcrossprod(Amat[[1]], Zmat) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.