Coordinate Descent algorithm to solve Elastic-Net-type problems
Computes the entire Elastic-Net solution for the regression coefficients over all values of the penalization parameter, via the Coordinate Descent (CD) algorithm (Friedman et al., 2007). It takes as inputs a variance matrix among predictors and a covariance vector between the response and the predictors.
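For example, a minimal sketch of how these inputs are typically formed from a matrix of predictors X and a response vector y (both hypothetical objects used here only for illustration):

P = var(X)                      # variance-covariance matrix among predictors
v = cov(y, X)                   # covariance between response and predictors
fm = solveEN(P, v, alpha = 0.5) # Elastic-Net solutions over a grid of lambdas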
solveEN(P, v, alpha = 1, lambda = NULL, nLambda = 100, minLambda = .Machine$double.eps^0.5, scale = TRUE, tol = 1e-05, maxIter = 1000, verbose = FALSE)
P
: (numeric matrix) Variance-covariance matrix of predictors. It can be of the "float32" type as per the 'float' R-package.
v
: (numeric vector) Covariance between the response variable and the predictors.
lambda
: (numeric vector) Penalization parameter sequence. Default is NULL, in which case a decreasing grid of nLambda values is generated automatically; when alpha = 0 the grid is generated starting from a maximum equal to 5.
nLambda
: (integer) Number of lambdas generated when lambda = NULL.
minLambda
: (numeric) Minimum value of lambda generated when lambda = NULL.
alpha
: (numeric) Value between 0 and 1 for the weights given to the L1- and L2-penalties.
scale
: (logical) Whether to scale the matrix P to a correlation matrix (and rescale v accordingly) so that predictors have unit variance. Default is TRUE.
tol
: (numeric) Maximum error between two consecutive solutions of the CD algorithm to declare convergence.
maxIter
: (integer) Maximum number of iterations to run the CD algorithm at each lambda step before convergence is reached.
verbose
: (logical) If TRUE, progress is printed while the algorithm runs.
Finds solutions for the regression coefficients in a linear model

    yi = xi'β + ei

where yi is the response for the ith observation, xi = (xi1,...,xip)' is a vector of p predictors assumed to have unit variance, β = (β1,...,βp)' is a vector of regression coefficients, and ei is a residual.
The regression coefficients β are estimated as a function of the variance matrix among predictors (P) and the covariance vector between the response and predictors (v) by minimizing the penalized mean squared error function

    -v'β + (1/2)β'Pβ + λ J(β)

where λ is the penalization parameter and J(β) is a penalty function given by

    J(β) = α||β||1 + (1/2)(1 - α)||β||2²

where 0 ≤ α ≤ 1, and ||β||1 = ∑|βj| and ||β||2² = ∑βj² are the L1 and (squared) L2-norms, respectively.
The "partial residual" excluding the contribution of the predictor xij is
then the ordinary least-squares (OLS) coefficient of xij on this residual is (up-to a constant)
where vj is the jth element of v and Pj is the jth column of the matrix P.
Coefficients are updated, for each j = 1,...,p, from their current value βj to a new value βj(α,λ), given α and λ, by "soft-thresholding" their OLS estimate, cycling over the predictors until convergence, as fully described in Friedman et al. (2007).
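As an illustration only, below is a minimal sketch of this update for a single pair (α, λ), assuming diag(P) = 1 (unit-variance predictors); the functions 'soft' and 'cd_enet' are hypothetical names introduced here and are not the package's implementation:

# Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0)
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Coordinate descent for the objective -v'b + (1/2) b'Pb + lambda*J(b)
cd_enet <- function(P, v, alpha = 1, lambda = 0, tol = 1e-5, maxIter = 1000) {
  b <- rep(0, length(v))
  for(iter in 1:maxIter) {
    bOld <- b
    for(j in seq_along(v)) {
      bj_ols <- v[j] - sum(P[,j]*b) + b[j]   # OLS coefficient on the partial residual
      b[j] <- soft(bj_ols, lambda*alpha) / (1 + lambda*(1 - alpha))  # EN update
    }
    if(max(abs(b - bOld)) < tol) break       # declare convergence
  }
  b
}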
Returns a list object containing the elements:
lambda
: (vector) the sequence of values of the penalization parameter.
beta
: (matrix) regression coefficients for each predictor (in columns) associated with each value of the penalization parameter lambda (in rows).
df
: (vector) degrees of freedom, i.e., the number of non-zero predictors associated with each value of lambda.
The returned object is of the class 'LASSO', for which a fitted method exists. Function 'plotPath' can also be used to plot the path of the coefficients, as sketched below.
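For instance, assuming 'fm' is an object returned by solveEN and 'Xnew' is a hypothetical matrix of predictors:

plotPath(fm)                  # path of the coefficients along the lambda sequence
yHat = fitted(fm, X = Xnew)   # predicted values for the rows of Xnew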
Marco Lopez-Cruz (maraloc@gmail.com) and Gustavo de los Campos
Friedman J, Hastie T, Höfling H, Tibshirani R (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2), 302–332.
Hoerl AE, Kennard RW (1970). Ridge Regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.
Tibshirani R (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society B, 58(1), 267–288.
Zou H, Hastie T (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society B, 67(2), 301–320.
require(SFSI)
data(wheatHTP)

y = as.vector(Y[,"YLD"])  # Response variable
X = scale(WL)             # Predictors

# Training and testing sets
tst = 1:ceiling(0.3*length(y))
trn = seq_along(y)[-tst]

# Calculate covariances in training set
XtX = var(X[trn,])
Xty = cov(y[trn],X[trn,])

# Run the penalized regression
fm1 = solveEN(XtX,Xty,alpha=0.5)

# Predicted values
yHat1 = fitted(fm1, X=X[trn,])  # training data
yHat2 = fitted(fm1, X=X[tst,])  # testing data

# Penalization vs correlation
plot(-log(fm1$lambda),cor(y[trn],yHat1)[1,], main="training")
plot(-log(fm1$lambda),cor(y[tst],yHat2)[1,], main="testing")

if(requireNamespace("float")){
  # Using a 'float' type variable
  XtX2 = float::fl(XtX)
  fm2 = solveEN(XtX2,Xty,alpha=0.5)
  max(abs(fm1$beta-fm2$beta))  # Check for discrepancies
}