Classification tree evaluation by CV
Evaluation of classification trees by cross-validation
treeEval(X, grp, train, kfold = 10, cp = seq(0.01, 0.1, by = 0.01), plotit = TRUE, legend = TRUE, legpos = "bottomright", ...)
X | standardized complete X data matrix (training and test data)
grp | factor with groups for complete data (training and test data)
train | row indices of X indicating training data objects
kfold | number of folds for cross-validation
cp | range for tree complexity parameter, see
plotit | if TRUE a plot will be generated
legend | if TRUE a legend will be added to the plot
legpos | positioning of the legend in the plot
... | additional plot arguments
The data are split into a calibration (training) set and a test set; the calibration set is given by "train". Within the calibration set, "kfold"-fold CV is performed: the classification method is fitted to "kfold"-1 parts and evaluated on the remaining part, in turn for each part. The misclassification error is then computed for the training data, for the held-out CV parts (CV error), and for the test data.
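The procedure described above can be sketched in a few lines of R. This is only an illustration for a single cp value, not the package source: the helper name cv_tree_error is hypothetical, the use of rpart with method = "class" and the standard-error formula sd/sqrt(kfold) are assumptions, and the iris data merely stand in for a real data set.

```r
library(rpart)  # recommended package shipped with R

# Hypothetical helper (not part of any package): training, CV and test
# error of an rpart classification tree for one complexity value cp.
# 'dat' must contain the class factor in a column named 'grp'.
cv_tree_error <- function(dat, train, kfold = 10, cp = 0.01) {
  calib <- dat[train, , drop = FALSE]   # calibration (training) set
  test  <- dat[-train, , drop = FALSE]  # test set

  # Randomly assign every calibration row to one of 'kfold' parts
  fold <- sample(rep(seq_len(kfold), length.out = nrow(calib)))

  cverr <- numeric(kfold)
  for (i in seq_len(kfold)) {
    # Fit on kfold-1 parts, evaluate on the part that was left out
    fit  <- rpart(grp ~ ., data = calib[fold != i, , drop = FALSE],
                  method = "class", control = rpart.control(cp = cp))
    held <- calib[fold == i, , drop = FALSE]
    pred <- predict(fit, newdata = held, type = "class")
    cverr[i] <- mean(as.character(pred) != as.character(held$grp))
  }

  # Refit on the whole calibration set for training and test error
  fit <- rpart(grp ~ ., data = calib, method = "class",
               control = rpart.control(cp = cp))
  err <- function(d) {
    mean(as.character(predict(fit, d, type = "class")) != as.character(d$grp))
  }

  list(trainerr = err(calib),
       testerr  = err(test),
       cvMean   = mean(cverr),
       cvSe     = sd(cverr) / sqrt(kfold),  # assumed SE definition
       cverr    = cverr)
}

# Usage on a stand-in data set
set.seed(1)
dat   <- data.frame(grp = iris$Species, scale(iris[, 1:4]))
train <- sample(nrow(dat), 100)
res   <- cv_tree_error(dat, train, kfold = 10, cp = 0.01)
```

Repeating the call over a vector of cp values would give one set of errors per complexity value, which is the kind of comparison the function is meant to support.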
trainerr | training error rate
testerr | test error rate
cvMean | mean of CV errors
cvSe | standard error of CV errors
cverr | all errors from CV
cp | range for tree complexity parameter, taken from input
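One possible way to use these components is to pick a complexity value from the CV results. The sketch below assumes that cvMean and cvSe are vectors aligned with cp, which this page does not state explicitly, and it applies the common one-standard-error heuristic, which is a choice made here for illustration. The numbers are made up and are not output of treeEval.

```r
# Illustrative values only, NOT real treeEval() output.
# Assumption: cvMean and cvSe hold one entry per value in cp.
res <- list(cp     = c(0.01, 0.02, 0.05, 0.10),
            cvMean = c(0.30, 0.28, 0.30, 0.40),
            cvSe   = c(0.03, 0.03, 0.03, 0.04))

# cp value with the smallest mean CV error
best <- which.min(res$cvMean)

# One-standard-error heuristic: take the largest cp (simplest tree)
# whose mean CV error is within one SE of the minimum
threshold <- res$cvMean[best] + res$cvSe[best]
cp_1se    <- max(res$cp[res$cvMean <= threshold])
```

Whether the minimum-error cp or the one-standard-error cp is preferable depends on how much tree simplicity matters for the application.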
Peter Filzmoser <P.Filzmoser@tuwien.ac.at>
K. Varmuza and P. Filzmoser: Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press, Boca Raton, FL, 2009.
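library(chemometrics)  # provides treeEval()
library(rpart)         # tree fitting

# Forensic glass data from MASS: 9 measurements, glass type as class
data(fgl, package = "MASS")
grp <- fgl$type
X <- scale(fgl[, 1:9])          # standardize the predictors
k <- length(unique(grp))        # number of groups
dat <- data.frame(grp, X)
n <- nrow(X)
ntrain <- round(n * 2/3)        # two thirds of the objects for training

set.seed(123)
train <- sample(1:n, ntrain)    # row indices of the training objects

par(mar = c(4, 4, 3, 1))
# Note: in R, 0.02:0.05 and 0.2:0.5 each evaluate to a single value
# (0.02 and 0.2), so six cp values are evaluated here.
restree <- treeEval(X, grp, train,
                    cp = c(0.01, 0.02:0.05, 0.1, 0.15, 0.2:0.5, 1))
title("Classification trees")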