Split-sample-derived Shrinkage After Estimation
Shrink regression coefficients using a split-sample-derived shrinkage factor.
splitval(dataset, model, nrounds, fract, sdm, int = TRUE, int.adj)
dataset |
a dataset for regression analysis. Data should be in the form
of a matrix, with the outcome variable as the final column. Application of the
|
model |
type of regression model. Either "linear" or "logistic". |
nrounds |
the number of times to replicate the sample splitting process. |
fract |
the fraction of observations designated to the training set |
sdm |
a shrinkage design matrix. For examples, see |
int |
logical. If TRUE the model will include a regression intercept. |
int.adj |
logical. If TRUE the regression intercept will be re-estimated after shrinkage of the regression coefficients. |
This function applies sample-splitting to a dataset in order to derive a shrinkage factor and apply it to the regression coefficients. Data are randomly split into two sets, a training set and a test set. Regression coefficients are estimated using the training set, and a shrinkage factor is then estimated using the test set. This splitting is repeated nrounds times, and the mean of the resulting shrinkage factors is applied to the original regression coefficients. The regression intercept may then be re-estimated (see int.adj).
This process can currently be applied to linear or logistic regression models.
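The procedure described above can be sketched in a few lines of base R for the linear case. This is a minimal illustration and not the package's internal code. It assumes that the shrinkage factor in each round is the calibration slope, that is, the slope obtained by regressing the test-set outcome on the linear predictor built from the training-set coefficients, and that the intercept is re-estimated so that the mean prediction matches the mean outcome. The helper name split_shrink_sketch is hypothetical, and the element names of its return value only mirror the documented ones for readability.

```r
# Hypothetical helper, not part of the package: split-sample shrinkage for a
# linear model. Assumes the shrinkage factor is the calibration slope.
split_shrink_sketch <- function(X, y, nrounds = 100, fract = 0.75) {
  n <- nrow(X)
  n_train <- floor(fract * n)
  lambdas <- numeric(nrounds)

  for (r in seq_len(nrounds)) {
    # Randomly designate a fraction of the observations as the training set
    train <- sample(n, n_train)

    # Estimate coefficients (intercept first) on the training set
    b <- lm.fit(cbind(1, X[train, , drop = FALSE]), y[train])$coefficients

    # Linear predictor, excluding the intercept, in the held-out test set
    lp <- drop(X[-train, , drop = FALSE] %*% b[-1])

    # Calibration slope: regress the test-set outcome on the linear predictor
    lambdas[r] <- coef(lm(y[-train] ~ lp))[2]
  }

  # Mean shrinkage factor over all rounds
  lambda <- mean(lambdas)

  # Coefficients estimated on the full dataset, then shrunk (slopes only)
  raw <- lm.fit(cbind(1, X), y)$coefficients
  shrunk_slopes <- lambda * raw[-1]

  # Re-estimate the intercept so the mean prediction equals the mean outcome
  intercept <- mean(y) - sum(colMeans(X) * shrunk_slopes)

  list(raw.coeff = raw,
       shrunk.coeff = c(intercept, shrunk_slopes),
       lambda = lambda)
}

set.seed(321)
X <- as.matrix(iris[, 1:3])   # predictors
y <- iris[, 4]                # outcome
res <- split_shrink_sketch(X, y, nrounds = 100, fract = 0.75)
res$lambda
res$shrunk.coeff
```

A mean shrinkage factor close to 1 suggests little overfitting, whereas values well below 1 pull the slopes more strongly towards zero. The sketch applies a single factor to all slopes; it does not attempt to reproduce the role of the shrinkage design matrix (sdm) or the logistic case.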
splitval
returns a list containing the following:
raw.coeff |
the raw regression model coefficients, pre-shrinkage. |
shrunk.coeff |
the shrunken regression model coefficients |
lambda |
the mean shrinkage factor over Nrounds split-sample replicates |
Nrounds |
the number of rounds of sample splitting |
sdm |
the shrinkage design matrix used to apply the shrinkage factor(s) to the regression coefficients |
## Example 1: Linear regression using the iris dataset
## Split-sample-derived shrinkage with 100 rounds of sample-splitting
data(iris)
iris.data <- as.matrix(iris[, 1:4])
iris.data <- cbind(1, iris.data)
sdm1 <- matrix(c(0, 1, 1, 1), nrow = 1)
set.seed(321)
splitval(dataset = iris.data, model = "linear", nrounds = 100,
         fract = 0.75, sdm = sdm1, int = TRUE, int.adj = TRUE)

## Example 2: Logistic regression using a subset of the mtcars data
## Split-sample-derived shrinkage
data(mtcars)
mtc.data <- cbind(1, datashape(mtcars, y = 8, x = c(1, 6, 9)))
head(mtc.data)
set.seed(123)
splitval(dataset = mtc.data, model = "logistic", nrounds = 100,
         fract = 0.5)