ROCc

The Receiver Operating Characteristic (ROC) Curve


Description

Computes the exact area under the ROC curve (AUROC), the Gini coefficient, and the Kolmogorov-Smirnov (KS) statistic for a binary classifier. Optionally, the function also plots the ROC curve, that is, the estimates of Sensitivity against the estimates of 1-Specificity.

Usage

ROCc(object, plot.it = TRUE, verbose = TRUE, ...)

Arguments

object

a matrix with two columns: the first is a numeric vector of 1's and 0's indicating whether each row is a "success" or a "failure"; the second is a numeric vector giving the probability (or propensity score) that each row is a "success". Alternatively, object can be an object of class glm obtained by fitting a generalized linear model in which the response variable is assumed to follow a binomial distribution.

plot.it

an (optional) logical switch indicating whether the ROC curve should be plotted or only the data matrix on which it is based is required. By default, plot.it is set to TRUE.

verbose

an (optional) logical switch indicating whether the report of results should be printed. By default, verbose is set to TRUE.

...

further arguments passed to or from other methods. For example, if plot.it=TRUE then ... may include graphical parameters such as col, pch, cex, main, sub, xlab, ylab.
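The two-column matrix form of object can be assembled in base R. The following is a minimal sketch on simulated data; the names x, sim_fit, responses, probs and the simulation settings are illustrative assumptions, not part of the package.

```r
# Simulated data for illustration only (seed and coefficients are arbitrary).
set.seed(1)
n <- 200
x <- rnorm(n)
responses <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

# Fit a binomial GLM and extract the fitted success probabilities.
sim_fit <- glm(responses ~ x, family = binomial("logit"))
probs <- fitted(sim_fit)

# First column: 0/1 outcomes; second column: estimated probabilities.
object <- cbind(responses, probs)

# Basic checks that the matrix has the shape described above.
stopifnot(is.matrix(object), ncol(object) == 2,
          all(object[, 1] %in% c(0, 1)),
          all(object[, 2] >= 0 & object[, 2] <= 1))
```

According to the argument description, either the matrix object or the fitted model sim_fit could then be supplied as the first argument of ROCc.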

Value

A list which contains the following objects:

  • roc: A matrix with the Cutoffs and the associated estimates of Sensitivity and Specificity.

  • auroc: The exact area under the ROC curve.

  • gini: The value of the Gini coefficient computed as 2(auroc-0.5).

  • ks: The value of the Kolmogorov-Smirnov statistic computed as the maximum value of |1-Sensitivity-Specificity|.
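To make the relationships among these quantities concrete, here is a minimal base-R sketch that computes them by hand for a toy data set. It is not the code used inside ROCc: the toy scores, the convention of classifying a row as a "success" when its score is at least the cutoff, and the trapezoidal integration are all assumptions made for illustration.

```r
# Toy data: 1 = "success", 0 = "failure"; scores are illustrative only.
y <- c(1, 1, 0, 1, 0, 0, 1, 0)
p <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1)

# Candidate cutoffs: every distinct score, from highest to lowest.
cutoffs <- sort(unique(p), decreasing = TRUE)

# Assumed convention: predict "success" when the score is >= cutoff.
sens <- sapply(cutoffs, function(k) mean(p[y == 1] >= k))
spec <- sapply(cutoffs, function(k) mean(p[y == 0] < k))

# ROC points (1 - Specificity, Sensitivity), starting at the (0, 0) corner.
fpr <- c(0, 1 - spec)
tpr <- c(0, sens)

# Area under the piecewise-linear ROC curve (trapezoidal rule).
auroc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# Gini coefficient and KS statistic, using the formulas listed above.
gini <- 2 * (auroc - 0.5)
ks <- max(abs(1 - sens - spec))

print(c(auroc = auroc, gini = gini, ks = ks))
```

For this toy data, 12 of the 16 success/failure pairs are ranked correctly, so the area is 0.75, the Gini coefficient is 0.5, and the largest gap between Sensitivity and 1-Specificity is 0.5.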

References

Hanley J.A. and McNeil B.J. (1982) The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology 143, 29–36.

Examples

burn1000 <- aplore3::burn1000

## splitting the sample
## 70% for the training sample and 30% for the validation sample
set.seed(123)  # arbitrary seed, added so that the random split is reproducible
burn1000 <- within(burn1000, sampleof <- "validation")
s <- sample(nrow(burn1000), size=floor(nrow(burn1000)*0.7))
burn1000$sampleof[s] <- "training"

mod <- death ~ age + tbsa + inh_inj + age*inh_inj + tbsa*inh_inj
training <- subset(burn1000,sampleof=="training")
fit <- glm(mod, family=binomial("logit"), data=training)

## ROC curve for the training sample
ROCc(fit, col="red", col.lab="blue", col.axis="black",
     col.main="black", family="mono")

validation <- subset(burn1000, sampleof=="validation")
probs <- predict(fit, newdata=validation, type="response")
responses <- with(validation, ifelse(death=="Dead",1,0))

## ROC curve for the validation sample
ROCc(cbind(responses,probs), col="red", col.lab="blue",
     col.axis="black", col.main="black", family="mono")

glmtoolbox

Set of Tools to Data Analysis using Generalized Linear Models

v0.1.0
GPL-2 | GPL-3
Authors
Luis Hernando Vanegas [aut, cre], Luz Marina Rondón [aut], Gilberto A. Paula [aut]
Initial release