discSurv: tprUnoShort – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

tprUnoShort

True Positive Rate for arbitrary predition models

Description

Estimates the true positive rate (based on concept of Uno, et al.) for an arbitrary discrete survival prediction model on one test data set.

Usage

tprUnoShort(timepoint, marker, newTime, newEvent, trainTime, trainEvent)

Arguments

`timepoint`	Gives the discrete time interval of which the tpr is evaluated (numeric scalar).
`marker`	Gives the predicted values of the linear predictor of a regression model (numeric vector). May also be on the response scale.
`newTime`	New time intervals in the test data (integer vector).
`newEvent`	New event indicators in the test data (integer vector with 0 or 1).
`trainTime`	Time intervals in the training data (integer vector).
`trainEvent`	Event indicators in the training data (integer vector with 0 or 1).

Details

This function is useful, if other models than generalized, linear models (glm) should be used for prediction. In the case of glm better use the cross validation version tprUno.

Value

List with objects

Output Data frame with two columns: "cutoff" gives the different marker values and "fpr" the false positive rates
Input A list of given argument input values (saved for reference). Another list element is selectInd, which gives the selected indices of the marker values with time intervals available in both training and test sets. In addition there is the list element orderMarker, which gives the indices of the marker values in increasing order.

Note

It is assumed that all time points up to the last observed interval [a_q-1, a_q) are available.

Author(s)

Thomas Welchowski welchow@imbie.meb.uni-bonn.de

Matthias Schmid matthias.schmid@imbie.uni-bonn.de

References

Matthias Schmid, Gerhard Tutz and Thomas Welchowski, (2017), Discrimination Measures for Discrete Time-to-Event Predictions, Econometrics and Statistics, Elsevier, Doi: 10.1016/j.ecosta.2017.03.008

Hajime Uno and Tianxi Cai and Lu Tian and L. J. Wei, (2007), Evaluating Prediction Rules for t-Year Survivors With Censored Regression Models, Journal of the American Statistical Association

Patrick J. Heagerty and Yingye Zheng, (2005), Survival Model Predictive Accuracy and ROC Curves, Biometrics 61, 92-105

Examples

##################################################
# Example with unemployment data and prior fitting

library(Ecdat)
library(caret)
library(mgcv)
data(UnempDur)
summary(UnempDur$spell)
# Extract subset of data
set.seed(635)
IDsample <- sample(1:dim(UnempDur)[1], 100)
UnempDurSubset <- UnempDur [IDsample, ]
set.seed(-570)
TrainingSample <- sample(1:100, 75)
UnempDurSubsetTrain <- UnempDurSubset [TrainingSample, ]
UnempDurSubsetTest <- UnempDurSubset [-TrainingSample, ]

# Convert to long format
UnempDurSubsetTrainLong <- dataLong(dataSet=UnempDurSubsetTrain, 
timeColumn="spell", censColumn="censor1")

# Estimate gam with smooth baseline
gamFit <- gam(formula=y ~ s(I(as.numeric(as.character(timeInt)))) + 
s(age) + s(logwage), data=UnempDurSubsetTrainLong, family=binomial())
gamFitPreds <- predict(gamFit, newdata=cbind(UnempDurSubsetTest, 
timeInt=UnempDurSubsetTest$spell))

# Estimate tpr given one training and one test sample
tprGamFit <- tprUnoShort (timepoint=1, marker=gamFitPreds, 
newTime=UnempDurSubsetTest$spell, newEvent=UnempDurSubsetTest$censor1, 
trainTime=UnempDurSubsetTrain$spell, trainEvent=UnempDurSubsetTrain$censor1)
plot(tprGamFit)

#####################################
# Example National Wilm's Tumor Study

library(survival)
head(nwtco)
summary(nwtco$rel)

# Select subset
set.seed(-375)
Indices <- sample(1:dim(nwtco)[1], 500)
nwtcoSub <- nwtco [Indices, ]

# Convert time range to 30 intervals
intLim <- quantile(nwtcoSub$edrel, prob=seq(0, 1, length.out=30))
intLim [length(intLim)] <- intLim [length(intLim)] + 1
nwtcoSubTemp <- contToDisc(dataSet=nwtcoSub, timeColumn="edrel", intervalLimits=intLim)
nwtcoSubTemp$instit <- factor(nwtcoSubTemp$instit)
nwtcoSubTemp$histol <- factor(nwtcoSubTemp$histol)
nwtcoSubTemp$stage <- factor(nwtcoSubTemp$stage)

# Split in training and test sample
set.seed(-570)
TrainingSample <- sample(1:dim(nwtcoSubTemp)[1], round(dim(nwtcoSubTemp)[1]*0.75))
nwtcoSubTempTrain <- nwtcoSubTemp [TrainingSample, ]
nwtcoSubTempTest <- nwtcoSubTemp [-TrainingSample, ]

# Convert to long format
nwtcoSubTempTrainLong <- dataLong(dataSet=nwtcoSubTempTrain, 
timeColumn="timeDisc", censColumn="rel")

# Estimate glm
inputFormula <- y ~ timeInt + histol + instit + stage
glmFit <- glm(formula=inputFormula, data=nwtcoSubTempTrainLong, family=binomial())
linPreds <- predict(glmFit, newdata=cbind(nwtcoSubTempTest, 
timeInt=nwtcoSubTempTest$timeDisc))

# Estimate tpr given one training and one test sample at time interval 5
tprFit <- tprUnoShort (timepoint=5, marker=linPreds, 
newTime=nwtcoSubTempTest$timeDisc, newEvent=nwtcoSubTempTest$rel, 
trainTime=nwtcoSubTempTrain$timeDisc, trainEvent=nwtcoSubTempTrain$rel)
plot(tprFit)

discSurv

Discrete Time Survival Analysis

v1.4.1

GPL-3

Authors

Thomas Welchowski <welchow@imbie.meb.uni-bonn.de> and Matthias Schmid <matthias.schmid@imbie.uni-bonn.de>

Initial release

2019-12-10