Simultaneous Component Analysis
Fits Timmerman and Kiers's four Simultaneous Component Analysis (SCA) models to a 3-way data array or a list of 2-way arrays with the same number of columns.
sca(X, nfac, nstart = 10, maxit = 500, type = c("sca-p", "sca-pf2", "sca-ind", "sca-ecp"), rotation = c("none", "varimax", "promax"), ctol = 1e-4, parallel = FALSE, cl = NULL, verbose = TRUE)
X |
List of length |
nfac |
Number of factors. |
nstart |
Number of random starts. |
maxit |
Maximum number of iterations. |
type |
Type of SCA model to fit. |
rotation |
Rotation to use for |
ctol |
Convergence tolerance. |
parallel |
Logical indicating if |
cl |
Cluster created by |
verbose |
If |
Given a list of matrices X[[k]] = matrix(xk,I[k],J)
for k = seq(1,K)
, the SCA model is
X[[k]] = tcrossprod(D[[k]],B) + E[[k]] |
where D[[k]] = matrix(dk,I[k],R)
are the Mode A (first mode) weights for the k
-th level of Mode C (third mode), B = matrix(b,J,R)
are the Mode B (second mode) weights, and E[[k]] = matrix(ek,I[k],J)
is the residual matrix corresponding to k
-th level of Mode C.
There are four different versions of the SCA model: SCA with invariant pattern (SCA-P), SCA with Parafac2 constraints (SCA-PF2), SCA with INDSCAL constraints (SCA-IND), and SCA with equal average crossproducts (SCA-ECP). These four models differ with respect to the assumed crossproduct structure of the D[[k]]
weights:
SCA-P: | crossprod(D[[k]])/I[k] = Phi[[k]] |
|
SCA-PF2: | crossprod(D[[k]])/I[k] = diag(C[k,])%*%Phi%*%diag(C[k,]) |
|
SCA-IND: | crossprod(D[[k]])/I[k] = diag(C[k,]*C[k,]) |
|
SCA-ECP: | crossprod(D[[k]])/I[k] = Phi |
|
where Phi[[k]]
is specific to the k
-th level of Mode C, Phi
is common to all K
levels of Mode C, and C = matrix(c,K,R)
are the Mode C (third mode) weights. This function estimates the weight matrices D[[k]]
and B
(and C
if applicable) using alternating least squares.
D |
List of length |
B |
Mode B weight matrix. |
C |
Mode C weight matrix. |
Phi |
Mode A common crossproduct matrix (if |
SSE |
Sum of Squared Errors. |
Rsq |
R-squared value. |
GCV |
Generalized Cross-Validation. |
edf |
Effective degrees of freedom. |
iter |
Number of iterations. |
cflag |
Convergence flag. |
type |
Same as input |
rotation |
Same as input |
The ALS algorithm can perform poorly if the number of factors nfac
is set too large.
The least squares SCA-P solution can be obtained from the singular value decomposition of the stacked matrix rbind(X[[1]],...,X[[K]])
.
The least squares SCA-PF2 solution can be obtained using the uncontrained Parafac2 ALS algorithm (see parafac2
).
The least squares SCA-IND solution can be obtained using the Parafac2 ALS algorithm with orthogonality constraints on Mode A.
The least squares SCA-ECP solution can be obtained using the Parafac2 ALS algorithm with orthogonality constraints on Mode A and the Mode C weights fixed at C[k,] = rep(I[k]^0.5,R)
.
Default use is 10 random strarts (nstart=10
) with 500 maximum iterations of the ALS algorithm for each start (maxit=500
) using a convergence tolerance of 1e-4 (ctol=1e-4
). The algorithm is determined to have converged once the change in R^2 is less than or equal to ctol
.
Output cflag
gives convergence information: cflag=0
if ALS algorithm converged normally, cflag=1
if maximum iteration limit was reached before convergence, and cflag=2
if ALS algorithm terminated abnormally due to problem with non-negativity constraints.
Nathaniel E. Helwig <helwig@umn.edu>
Helwig, N. E. (2013). The special sign indeterminacy of the direct-fitting Parafac2 model: Some implications, cautions, and recommendations, for Simultaneous Component Analysis. Psychometrika, 78, 725-739.
Timmerman, M. E., & Kiers, H. A. L. (2003). Four simultaneous component models for the analysis of multivariate time series from more than one subject to model intraindividual and interindividual differences. Psychometrika, 68, 105-121.
########## sca-p ########## # create random data list with SCA-P structure set.seed(3) mydim <- c(NA,10,20) nf <- 2 nk <- rep(c(50,100,200), length.out = mydim[3]) Dmat <- matrix(rnorm(sum(nk)*nf),sum(nk),nf) Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf) Dmats <- vector("list",mydim[3]) Xmat <- Emat <- vector("list",mydim[3]) dfc <- 0 for(k in 1:mydim[3]){ dinds <- 1:nk[k] + dfc Dmats[[k]] <- Dmat[dinds,] dfc <- dfc + nk[k] Xmat[[k]] <- tcrossprod(Dmats[[k]],Bmat) Emat[[k]] <- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2]) } rm(Dmat) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR=1 X <- mapply("+",Xmat,Emat) # fit SCA-P model (no rotation) scamod <- sca(X,nfac=nf,nstart=1) scamod # check solution crossprod(scamod$D[[1]] %*% diag(scamod$C[1,]^-1) ) / nk[1] crossprod(scamod$D[[5]] %*% diag(scamod$C[5,]^-1) ) / nk[5] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # reorder and resign factors scamod$B[1:4,] scamod <- reorder(scamod, 2:1) scamod$B[1:4,] scamod <- resign(scamod, mode="B", newsign=c(1,-1)) scamod$B[1:4,] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # rescale factors colSums(scamod$B^2) colSums(scamod$C^2) scamod <- rescale(scamod, mode="C") colSums(scamod$B^2) colSums(scamod$C^2) Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) ########## sca-pf2 ########## # create random data list with SCA-PF2 (Parafac2) structure set.seed(3) mydim <- c(NA,10,20) nf <- 2 nk <- rep(c(50,100,200), length.out = mydim[3]) Gmat <- 10*matrix(rnorm(nf^2),nf,nf) Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf) Cmat <- matrix(runif(mydim[3]*nf),mydim[3],nf) Xmat <- Emat <- Fmat <- vector("list",mydim[3]) for(k in 1:mydim[3]){ Fmat[[k]] <- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u Xmat[[k]] <- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]] <- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2]) } Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR=1 X <- mapply("+",Xmat,Emat) # fit SCA-PF2 model scamod <- sca(X,nfac=nf,nstart=1,type="sca-pf2") scamod # check solution scamod$Phi crossprod(scamod$D[[1]] %*% diag(scamod$C[1,]^-1) ) / nk[1] crossprod(scamod$D[[5]] %*% diag(scamod$C[5,]^-1) ) / nk[5] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # reorder and resign factors scamod$B[1:4,] scamod <- reorder(scamod, 2:1) scamod$B[1:4,] scamod <- resign(scamod, mode="B", newsign=c(1,-1)) scamod$B[1:4,] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # rescale factors colSums(scamod$B^2) colSums(scamod$C^2) scamod <- rescale(scamod, mode="C") colSums(scamod$B^2) colSums(scamod$C^2) Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) ########## sca-ind ########## # create random data list with SCA-IND structure set.seed(3) mydim <- c(NA,10,20) nf <- 2 nk <- rep(c(50,100,200), length.out = mydim[3]) Gmat <- diag(nf) # SCA-IND is Parafac2 with Gmat=identity Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf) Cmat <- 10*matrix(runif(mydim[3]*nf),mydim[3],nf) Xmat <- Emat <- Fmat <- vector("list",mydim[3]) for(k in 1:mydim[3]){ Fmat[[k]] <- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u Xmat[[k]] <- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]] <- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2]) } Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR=1 X <- mapply("+",Xmat,Emat) # fit SCA-IND model scamod <- sca(X,nfac=nf,nstart=1,type="sca-ind") scamod # check solution scamod$Phi crossprod(scamod$D[[1]] %*% diag(scamod$C[1,]^-1) ) / nk[1] crossprod(scamod$D[[5]] %*% diag(scamod$C[5,]^-1) ) / nk[5] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # reorder and resign factors scamod$B[1:4,] scamod <- reorder(scamod, 2:1) scamod$B[1:4,] scamod <- resign(scamod, mode="B", newsign=c(1,-1)) scamod$B[1:4,] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # rescale factors colSums(scamod$B^2) colSums(scamod$C^2) scamod <- rescale(scamod, mode="C") colSums(scamod$B^2) colSums(scamod$C^2) Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) ########## sca-ecp ########## # create random data list with SCA-ECP structure set.seed(3) mydim <- c(NA,10,20) nf <- 2 nk <- rep(c(50,100,200), length.out = mydim[3]) Gmat <- diag(nf) Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf) Cmat <- matrix(sqrt(nk),mydim[3],nf) Xmat <- Emat <- Fmat <- vector("list",mydim[3]) for(k in 1:mydim[3]){ Fmat[[k]] <- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u Xmat[[k]] <- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]] <- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2]) } Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR=1 X <- mapply("+",Xmat,Emat) # fit SCA-ECP model scamod <- sca(X,nfac=nf,nstart=1,type="sca-ecp") scamod # check solution scamod$Phi crossprod(scamod$D[[1]] %*% diag(scamod$C[1,]^-1) ) / nk[1] crossprod(scamod$D[[5]] %*% diag(scamod$C[5,]^-1) ) / nk[5] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # reorder and resign factors scamod$B[1:4,] scamod <- reorder(scamod, 2:1) scamod$B[1:4,] scamod <- resign(scamod, mode="B", newsign=c(-1,1)) scamod$B[1:4,] Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) # rescale factors colSums(scamod$B^2) colSums(scamod$C^2) scamod <- rescale(scamod, mode="B") colSums(scamod$B^2) colSums(scamod$C^2) Xhat <- fitted(scamod) sse <- sumsq(mapply("-",Xmat,Xhat)) sse/(sum(nk)*mydim[2]) ## Not run: ########## parallel computation ########## # create random data list with SCA-IND structure set.seed(3) mydim <- c(NA,10,20) nf <- 2 nk <- rep(c(50,100,200), length.out = mydim[3]) Gmat <- diag(nf) # SCA-IND is Parafac2 with Gmat=identity Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf) Cmat <- 10*matrix(runif(mydim[3]*nf),mydim[3],nf) Xmat <- Emat <- Fmat <- vector("list",mydim[3]) for(k in 1:mydim[3]){ Fmat[[k]] <- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u Xmat[[k]] <- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]] <- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2]) } Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR=1 X <- mapply("+",Xmat,Emat) # fit SCA-PF2 model (10 random starts -- sequential computation) set.seed(1) system.time({scamod <- sca(X,nfac=nf,type="sca-pf2")}) scamod # fit SCA-PF2 model (10 random starts -- parallel computation) cl <- makeCluster(detectCores()) ce <- clusterEvalQ(cl,library(multiway)) clusterSetRNGStream(cl, 1) system.time({scamod <- sca(X,nfac=nf,type="sca-pf2",parallel=TRUE,cl=cl)}) scamod stopCluster(cl) # fit SCA-IND model (10 random starts -- sequential computation) set.seed(1) system.time({scamod <- sca(X,nfac=nf,type="sca-ind")}) scamod # fit SCA-IND model (10 random starts -- parallel computation) cl <- makeCluster(detectCores()) ce <- clusterEvalQ(cl,library(multiway)) clusterSetRNGStream(cl, 1) system.time({scamod <- sca(X,nfac=nf,type="sca-ind",parallel=TRUE,cl=cl)}) scamod stopCluster(cl) # fit SCA-ECP model (10 random starts -- sequential computation) set.seed(1) system.time({scamod <- sca(X,nfac=nf,type="sca-ecp")}) scamod # fit SCA-ECP model (10 random starts -- parallel computation) cl <- makeCluster(detectCores()) ce <- clusterEvalQ(cl,library(multiway)) clusterSetRNGStream(cl, 1) system.time({scamod <- sca(X,nfac=nf,type="sca-ecp",parallel=TRUE,cl=cl)}) scamod stopCluster(cl) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.