
llm.cv

Runs v-fold cross validation with LLM


Description

In v-fold cross validation, the data are divided into v subsets of approximately equal size. In each iteration, one of the v subsets is held out while the remainder of the data is used to fit a logitleafmodel object, and predictions are generated for the held-out subset. The process is repeated v times, so that every observation receives exactly one out-of-fold prediction.
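The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the variable names (fold_id, pred) are hypothetical, and a plain logistic regression fitted with glm() stands in for the logit leaf model.

```r
# Sketch of v-fold cross validation in base R.
# Assumption: glm() logistic regression is a stand-in for the logit leaf model.
set.seed(42)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, size = 1, prob = plogis(0.8 * X$x1 - 0.5 * X$x2))

v <- 5
# Assign each observation to one of v folds of approximately equal size
fold_id <- sample(rep(seq_len(v), length.out = n))

# One out-of-fold predicted probability per observation
pred <- numeric(n)
for (k in seq_len(v)) {
  test_idx <- which(fold_id == k)
  # Fit on the v - 1 remaining folds
  train <- cbind(X[-test_idx, , drop = FALSE], Y = Y[-test_idx])
  fit <- glm(Y ~ ., data = train, family = binomial())
  # Predict only for the held-out fold
  pred[test_idx] <- predict(fit, newdata = X[test_idx, , drop = FALSE],
                            type = "response")
}
```

Because the fold labels are shuffled before use, each fold is a random subset, and every position in pred is filled exactly once.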

Usage

llm.cv(X, Y, cv, threshold_pruning = 0.25, nbr_obs_leaf = 100)

Arguments

X

Data frame containing the numerical independent variables.

Y

Numerical vector containing the dependent variable. Currently only binary classification is supported.

cv

An integer specifying the number of folds in the cross-validation.

threshold_pruning

Confidence threshold used for pruning. Default 0.25.

nbr_obs_leaf

The minimum number of observations in a leaf node. Default 100.

Value

An object of class llm.cv, which is a list with the following components:

foldpred

a data frame with, per fold, predicted class membership probabilities for the left-out observations.

pred

a data frame with predicted class membership probabilities.

foldclass

a data frame with, per fold, predicted classes for the left-out observations.

class

a data frame with the predicted classes.

conf

the confusion matrix, which compares the real versus the predicted class memberships based on the class object.
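To illustrate how a confusion matrix relates predicted classes to the real ones, here is a minimal base R sketch. The labels and probabilities are made up for illustration, the 0.5 cutoff is an assumption (this page does not state which cutoff llm.cv uses), and the row and column orientation of the table is a choice made here, not necessarily that of the conf component.

```r
# Hypothetical true labels and out-of-fold probabilities for the "pos" class
actual <- factor(c("neg", "pos", "pos", "neg", "pos", "neg"),
                 levels = c("neg", "pos"))
prob_pos <- c(0.10, 0.80, 0.40, 0.30, 0.90, 0.60)

# Assumption: classify as "pos" when the probability is at least 0.5
predicted <- factor(ifelse(prob_pos >= 0.5, "pos", "neg"),
                    levels = c("neg", "pos"))

# Cross-tabulate real versus predicted class memberships
conf <- table(Actual = actual, Predicted = predicted)

# Share of observations on the diagonal, i.e. classified correctly
accuracy <- sum(diag(conf)) / sum(conf)
```

Fixing the factor levels on both vectors keeps the table square even if one class is never predicted.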

Author(s)

Arno De Caigny, a.de-caigny@ieseg.fr, Kristof Coussement, k.coussement@ieseg.fr and Koen W. De Bock, kdebock@audencia.com

References

Arno De Caigny, Kristof Coussement, Koen W. De Bock, A New Hybrid Classification Algorithm for Customer Churn Prediction Based on Logistic Regression and Decision Trees, European Journal of Operational Research (2018), doi: 10.1016/j.ejor.2018.02.009.

Examples

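## Load the PimaIndiansDiabetes dataset from the mlbench package.
## Everything runs inside the check so the example is skipped if mlbench is missing.
if (requireNamespace("mlbench", quietly = TRUE)) {
  library("LLM")
  library("mlbench")
  data("PimaIndiansDiabetes")
  ## Run 5-fold cross validation; column 9 (diabetes) is the dependent variable
  Pima.llm <- llm.cv(X = PimaIndiansDiabetes[, -c(9)],
                     Y = PimaIndiansDiabetes$diabetes,
                     cv = 5,
                     threshold_pruning = 0.25,
                     nbr_obs_leaf = 100)
}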

LLM

Logit Leaf Model Classifier for Binary Classification

v1.1.0
GPL (>= 3)
Authors
Arno De Caigny [aut, cre], Kristof Coussement [aut], Koen W. De Bock [aut]
Initial release
2020-05-05
