
CVlm

Cross-Validation for Linear Regression


Description

This function gives internal and cross-validation measures of predictive accuracy for multiple linear regression. (For binary logistic regression, use the CVbinary function.) The data are randomly assigned to a number of ‘folds’. Each fold is removed in turn, while the remaining data are used to re-fit the regression model and to predict the observations in the removed fold.
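The fold procedure described above can be sketched in base R. This is an illustrative sketch only, not the DAAG implementation: it uses the built-in cars data set, and the variable names (fold, cvpred, cv_ms, internal_ms) are chosen here for illustration.

```r
# Manual m-fold cross-validation for a linear model, base R only.
# Illustrative sketch; not the code used inside CVlm.
set.seed(29)                      # plays the role of the `seed` argument
dat  <- cars                      # built-in data frame: speed, dist
m    <- 3                         # number of folds
form <- dist ~ speed

# Randomly assign each row to one of m folds of near-equal size
fold <- sample(rep(seq_len(m), length.out = nrow(dat)))

cvpred <- numeric(nrow(dat))
for (i in seq_len(m)) {
  held_out <- fold == i
  # Re-fit the model without fold i, then predict the held-out rows
  fit_i <- lm(form, data = dat[!held_out, ])
  cvpred[held_out] <- predict(fit_i, newdata = dat[held_out, ])
}

# Internal predictions: fitted values from the model using all observations
predicted <- fitted(lm(form, data = dat))

internal_ms <- mean((dat$dist - predicted)^2)   # internal mean square error
cv_ms       <- mean((dat$dist - cvpred)^2)      # cross-validation mean square error
print(c(internal = internal_ms, cross_validation = cv_ms))
```

The cross-validation figure is never smaller than the internal one, because each held-out observation is predicted by a model that was fitted without it.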

Usage

CVlm(data = DAAG::houseprices, form.lm = formula(sale.price ~ area),
              m = 3, dots = FALSE, seed = 29, plotit = c("Observed","Residual"),
              main="Small symbols show cross-validation predicted values",
              legend.pos="topleft", printit = TRUE)
cv.lm(data = DAAG::houseprices, form.lm = formula(sale.price ~ area),
              m = 3, dots = FALSE, seed = 29, plotit = c("Observed","Residual"),
              main="Small symbols show cross-validation predicted values",
              legend.pos="topleft", printit = TRUE)

Arguments

data

a data frame

form.lm

a formula, an lm call, or an lm object

m

the number of folds

dots

logical; if TRUE, uses pch=16 for the plotting character

seed

random number generator seed

plotit

One of the text strings "Observed" or "Residual", or a logical value. TRUE is equivalent to "Observed", and FALSE is equivalent to "" (no plot).

main

main title for graph

legend.pos

position of legend: one of "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center".

printit

if TRUE, output is printed to the screen

Details

When plotit="Residual" and there is more than one explanatory variable, the fitted lines that are shown for the individual folds are approximations.

Value

The input data frame is returned, with additional columns Predicted (Predicted values using all observations) and cvpred (cross-validation predictions). The cross-validation residual sum of squares (ss) and degrees of freedom (df) are returned as attributes of the data frame.
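A short sketch of inspecting the returned object, assuming the DAAG package is installed. It uses only the arguments and column names documented on this page. The name cv_ss is introduced here for illustration, and the sum of squares is recomputed from the cvpred column rather than read from an attribute.

```r
# Sketch only: runs the body only if the DAAG package is available.
if (requireNamespace("DAAG", quietly = TRUE)) {
  out <- DAAG::CVlm(data = DAAG::houseprices,
                    form.lm = formula(sale.price ~ area),
                    m = 3, seed = 29, plotit = FALSE, printit = FALSE)

  # Columns added to the input data frame
  print(head(out[, c("sale.price", "Predicted", "cvpred")]))

  # Cross-validation residual sum of squares, recomputed from cvpred
  cv_ss <- sum((out$sale.price - out$cvpred)^2)
  print(cv_ss)

  # Attributes attached to the returned data frame
  print(names(attributes(out)))
}
```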

Author(s)

J.H. Maindonald

See Also

Examples

CVlm()
## Not run: 
CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
          plotit="Observed")
CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
     plotit="Residual")
out <- CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
               plotit="Observed")
## The summary statistics are attributes of the returned data frame, not columns
attributes(out)[c("ms","df")]

## End(Not run)

DAAG

Data Analysis and Graphics Data and Functions

v1.24
GPL-3
Authors
John H. Maindonald and W. John Braun
Initial release
