False Positive Rate for arbitrary predition models
Estimates the false positive rate (based on concept of Uno, et al.) for an arbitrary discrete survival prediction model on one test data set.
fprUnoShort(timepoint, marker, newTime)
timepoint |
Gives the discrete time interval of which the fpr is evaluated (numeric scalar). |
marker |
Gives the predicted values of the linear predictor of a regression model (numeric vector). May also be on the response scale. |
newTime |
New time intervals in the test data (integer vector). |
This function is useful, if other models than generalized, linear models (glm) should be used for prediction. In the case of glm better use the cross validation version fprUno
.
Output: A list with objects:
Output: Data frame with two columns: "cutoff" gives the different marker values and "fpr" the false positive rates
Input: A list of given argument input values (saved for reference). In addition there is the list element orderMarker
, which gives the indices of the marker values in increasing order.
It is assumed that all time points up to the last observed interval [a_q-1, a_q) are available.
Thomas Welchowski welchow@imbie.meb.uni-bonn.de
Matthias Schmid matthias.schmid@imbie.uni-bonn.de
Matthias Schmid, Gerhard Tutz and Thomas Welchowski, (2017), Discrimination Measures for Discrete Time-to-Event Predictions, Econometrics and Statistics, Elsevier, Doi: 10.1016/j.ecosta.2017.03.008
Hajime Uno and Tianxi Cai and Lu Tian and L. J. Wei, (2007), Evaluating Prediction Rules for t-Year Survivors With Censored Regression Models, Journal of the American Statistical Association
Patrick J. Heagerty and Yingye Zheng, (2005), Survival Model Predictive Accuracy and ROC Curves, Biometrics 61, 92-105
################################################## # Example with unemployment data and prior fitting library(Ecdat) library(caret) library(mgcv) data(UnempDur) summary(UnempDur$spell) # Extract subset of data set.seed(635) IDsample <- sample(1:dim(UnempDur)[1], 100) UnempDurSubset <- UnempDur [IDsample, ] set.seed(-570) TrainingSample <- sample(1:100, 75) UnempDurSubsetTrain <- UnempDurSubset [TrainingSample, ] UnempDurSubsetTest <- UnempDurSubset [-TrainingSample, ] # Convert to long format UnempDurSubsetTrainLong <- dataLong(dataSet=UnempDurSubsetTrain, timeColumn="spell", censColumn="censor1") # Estimate gam with smooth baseline gamFit <- gam(formula=y ~ s(I(as.numeric(as.character(timeInt)))) + s(age) + s(logwage), data=UnempDurSubsetTrainLong, family=binomial()) gamFitPreds <- predict(gamFit, newdata=cbind(UnempDurSubsetTest, timeInt=UnempDurSubsetTest$spell)) # Estimate false positive rate fprGamFit <- fprUnoShort (timepoint=1, marker=gamFitPreds, newTime=UnempDurSubsetTest$spell) plot(fprGamFit) ##################################### # Example National Wilm's Tumor Study library(survival) head(nwtco) summary(nwtco$rel) # Select subset set.seed(-375) Indices <- sample(1:dim(nwtco)[1], 500) nwtcoSub <- nwtco [Indices, ] # Convert time range to 30 intervals intLim <- quantile(nwtcoSub$edrel, prob=seq(0, 1, length.out=30)) intLim [length(intLim)] <- intLim [length(intLim)] + 1 nwtcoSubTemp <- contToDisc(dataSet=nwtcoSub, timeColumn="edrel", intervalLimits=intLim) nwtcoSubTemp$instit <- factor(nwtcoSubTemp$instit) nwtcoSubTemp$histol <- factor(nwtcoSubTemp$histol) nwtcoSubTemp$stage <- factor(nwtcoSubTemp$stage) # Split in training and test sample set.seed(-570) TrainingSample <- sample(1:dim(nwtcoSubTemp)[1], round(dim(nwtcoSubTemp)[1]*0.75)) nwtcoSubTempTrain <- nwtcoSubTemp [TrainingSample, ] nwtcoSubTempTest <- nwtcoSubTemp [-TrainingSample, ] # Convert to long format nwtcoSubTempTrainLong <- dataLong(dataSet=nwtcoSubTempTrain, timeColumn="timeDisc", censColumn="rel") # Estimate glm inputFormula <- y ~ timeInt + histol + instit + stage glmFit <- glm(formula=inputFormula, data=nwtcoSubTempTrainLong, family=binomial()) linPreds <- predict(glmFit, newdata=cbind(nwtcoSubTempTest, timeInt=nwtcoSubTempTest$timeDisc)) # Estimate tpr given one training and one test sample at time interval 10 fprFit <- fprUnoShort (timepoint=10, marker=linPreds, newTime=nwtcoSubTempTest$timeDisc) plot(fprFit)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.