
predictRisk

Extracting predicted risks from regression models


Description

Extract event probabilities from fitted regression models and machine learning objects. The function predictRisk is a generic function, meaning that it invokes specifically designed functions (methods) depending on the 'class' of the first argument.
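To make the dispatch mechanism concrete, here is a minimal sketch using a hypothetical generic called toyRisk. Neither toyRisk nor its methods are part of riskRegression; they only mimic how a generic selects a method from the class of its first argument.

```r
# Hypothetical generic, for illustration only (not part of riskRegression)
toyRisk <- function(object, newdata, ...) {
  UseMethod("toyRisk")
}

# Fallback used when no class-specific method exists
toyRisk.default <- function(object, newdata, ...) {
  stop("No toyRisk method for class: ", class(object)[1])
}

# Method selected automatically when 'object' inherits from class "glm"
toyRisk.glm <- function(object, newdata, ...) {
  as.numeric(predict(object, newdata = newdata, type = "response"))
}

# Simulated binary outcome data (toy values)
set.seed(1)
toy <- data.frame(x = rnorm(50))
toy$y <- rbinom(50, 1, plogis(toy$x))
fit <- glm(y ~ x, data = toy, family = binomial())

# Dispatches to toyRisk.glm because class(fit) contains "glm"
p <- toyRisk(fit, newdata = toy[1:5, , drop = FALSE])
```

Calling toyRisk on an object without a matching method, such as a character string, falls through to toyRisk.default and raises an error.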

Usage

predictRisk(object, newdata, ...)

## Default S3 method:
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'double'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'integer'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'factor'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'numeric'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'glm'
predictRisk(object, newdata, iid = FALSE, average.iid = FALSE, ...)

## S3 method for class 'formula'
predictRisk(object, newdata, ...)

## S3 method for class 'BinaryTree'
predictRisk(object, newdata, ...)

## S3 method for class 'lrm'
predictRisk(object, newdata, ...)

## S3 method for class 'rpart'
predictRisk(object, newdata, ...)

## S3 method for class 'randomForest'
predictRisk(object, newdata, ...)

## S3 method for class 'matrix'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'aalen'
predictRisk(object, newdata, times, ...)

## S3 method for class 'cox.aalen'
predictRisk(object, newdata, times, ...)

## S3 method for class 'coxph'
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'coxphTD'
predictRisk(object, newdata, times, landmark, ...)

## S3 method for class 'CSCTD'
predictRisk(object, newdata, times, cause, landmark, ...)

## S3 method for class 'coxph.penal'
predictRisk(object, newdata, times, ...)

## S3 method for class 'cph'
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'selectCox'
predictRisk(object, newdata, times, ...)

## S3 method for class 'prodlim'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'survfit'
predictRisk(object, newdata, times, ...)

## S3 method for class 'psm'
predictRisk(object, newdata, times, ...)

## S3 method for class 'ranger'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'rfsrc'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'FGR'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'riskRegression'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'ARR'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'CauseSpecificCox'
predictRisk(
  object,
  newdata,
  times,
  cause,
  product.limit = TRUE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'penfitS3'
predictRisk(object, newdata, times, ...)

## S3 method for class 'SuperPredictor'
predictRisk(object, newdata, ...)

## S3 method for class 'gbm'
predictRisk(object, newdata, times, ...)

## S3 method for class 'flexsurvreg'
predictRisk(object, newdata, times, ...)

## S3 method for class 'singleEventCB'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'wglm'
predictRisk(
  object,
  newdata,
  times = NULL,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

Arguments

object

A fitted model from which to extract predicted event probabilities.

newdata

A data frame containing predictor variable combinations for which to compute predicted event probabilities.

...

Additional arguments that are passed on to the current method.

times

A vector of times within the range of the response variable at which the event probabilities (cumulative incidences) are computed.

cause

Identifies the cause of interest among the competing events.

iid

Should the iid decomposition be output using an attribute?

average.iid

Should the average iid decomposition be output using an attribute?

product.limit

If TRUE the survival is computed using the product limit estimator. Otherwise the exponential approximation is used (i.e. exp(-cumulative hazard)).

diag

When FALSE, the hazard/cumulative hazard/survival is computed for all observations at all times; otherwise it is computed only for the i-th observation at the i-th time.

landmark

The starting time for the computation of the cumulative risk.
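The arguments product.limit and diag can be illustrated with a small base R sketch. All numbers below are toy values, and the code shows only the generic formulas (product of 1 minus the hazard increments versus exp(-cumulative hazard)), not the package's internal implementation. The hazard ratios in hr are hypothetical.

```r
# Toy cumulative hazard increments at three event times (illustrative values)
dLambda <- c(0.10, 0.20, 0.15)

# Exponential approximation: S(t) = exp(-cumulative hazard)
surv_exp <- exp(-cumsum(dLambda))

# Product-limit form: S(t) = product over event times of (1 - hazard increment)
surv_pl <- cumprod(1 - dLambda)

# Corresponding risks, 1 - S(t)
risk_exp <- 1 - surv_exp
risk_pl <- 1 - surv_pl

# diag: all observations at all times versus observation i at time i only.
# Hypothetical hazard ratios for three subjects, proportional hazards form.
hr <- c(0.5, 1, 2)
risk_full <- 1 - exp(-outer(hr, cumsum(dLambda)))  # 3 subjects x 3 times
risk_diag <- diag(risk_full)                       # subject i at time i
```

Because 1 - x is never larger than exp(-x), the product-limit survival is never larger than the exponential approximation, so risk_pl is at least risk_exp at every time. The two agree closely when the hazard increments are small.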

Details

In uncensored binary outcome data there is no need to choose a time point.

When operating on models for survival analysis (without competing risks) the function still predicts the risk, computed as 1 - S(t|X), where S(t|X) is the survival probability at time t of a subject characterized by X.
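The relation risk = 1 - S(t|X) can be reproduced by hand with the survival package alone. The sketch below is an illustration under assumptions: it uses the lung data shipped with survival, two hypothetical subjects, and arbitrary horizons of 180 and 365 days. The package's own computation may differ in details such as the handling of ties.

```r
library(survival)

# Fit a Cox model on the 'lung' data shipped with the survival package
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Two hypothetical subjects
nd <- data.frame(age = c(50, 70), sex = c(1, 2))

# Predicted survival curves S(t|X), one curve per row of nd
sf <- survfit(fit, newdata = nd)

# Survival at the chosen horizons, then risk = 1 - survival
horizons <- c(180, 365)
S <- summary(sf, times = horizons)$surv  # rows: times, columns: subjects
risk <- t(1 - S)                         # rows: subjects, columns: times
```

The transposed result has the same layout as described in the Value section: one row per subject and one column per time point.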

When there are competing risks (and the data are right censored) one needs to specify both the time horizon for prediction (can be a vector) and the cause of the event. The function then extracts the absolute risks F_c(t|X), also known as the cumulative incidence of an event of type/cause c until time t for a subject characterized by X. Depending on the model it may or may not be possible to predict the risk of all causes in a competing risks setting. For example, a cause-specific Cox (CSC) object allows predicting the risk of each cause, whereas a Fine-Gray regression model (FGR) is specific to one of the causes.
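As a rough illustration of how an absolute risk F_c(t|X) arises from cause-specific hazards, the following base R sketch accumulates, at each event time, the probability of being event-free just before that time multiplied by the cause-specific hazard increment. The hazard increments are toy values for a single hypothetical subject, and the code is a sketch of the general formula rather than the package's implementation.

```r
# Toy cause-specific hazard increments at four event times (illustrative values)
dL1 <- c(0.05, 0.10, 0.00, 0.08)  # cause 1 (cause of interest)
dL2 <- c(0.02, 0.00, 0.06, 0.04)  # cause 2 (competing cause)

# Event-free survival after each time, product-limit form over all causes
S_all <- cumprod(1 - (dL1 + dL2))

# Event-free survival just before each time, S(s-)
S_minus <- c(1, head(S_all, -1))

# Cumulative incidence: F_c(t) = sum over s <= t of S(s-) * dLambda_c(s)
F1 <- cumsum(S_minus * dL1)
F2 <- cumsum(S_minus * dL2)
```

With the product-limit form used here, the absolute risks of the two causes and the event-free survival add up to one at every time point, which is a useful sanity check.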

Value

For a binary outcome, a vector of predicted risks. For a survival outcome, with or without competing risks, a matrix with as many rows as NROW(newdata) and as many columns as length(times). Each entry is a probability, and within each row the values should be non-decreasing when times is sorted in increasing order.
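The properties stated above can be checked programmatically. The helper check_risk_matrix below is hypothetical (it is not part of riskRegression), and the matrix p is a toy stand-in for the output of predictRisk, assuming times is sorted in increasing order.

```r
# Hypothetical helper (not part of riskRegression) that checks the documented
# properties of a predicted-risk matrix
check_risk_matrix <- function(p, newdata, times) {
  stopifnot(is.matrix(p))
  c(
    rows_match = nrow(p) == NROW(newdata),
    cols_match = ncol(p) == length(times),
    in_unit_interval = all(p >= 0 & p <= 1, na.rm = TRUE),
    nondecreasing_rows = all(apply(p, 1, function(r) {
      all(diff(r) >= 0, na.rm = TRUE)
    }))
  )
}

# Toy stand-in for predictRisk output: 2 subjects, 3 time points
times <- c(1, 5, 10)
nd <- data.frame(X1 = c(0, 1))
p <- rbind(c(0.05, 0.20, 0.35),
           c(0.10, 0.30, 0.50))
res <- check_risk_matrix(p, nd, times)
```

In practice one would pass the matrix returned by predictRisk together with the newdata and times used in the call; a FALSE entry in the result points to the property that is violated.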

Author(s)

Thomas A. Gerds tag@biostat.ku.dk

Examples

## binary outcome
library(riskRegression) # provides sampleData and predictRisk
library(rms)
set.seed(7)
d <- sampleData(80,outcome="binary")
nd <- sampleData(80,outcome="binary")
fit <- lrm(Y~X1+X8,data=d)
predictRisk(fit,newdata=nd)
## Not run: 
library(SuperLearner)
set.seed(1)
sl = SuperLearner(Y = d$Y, X = d[,-1], family = binomial(),
      SL.library = c("SL.mean", "SL.glmnet", "SL.randomForest"))

## End(Not run)

## survival outcome
# generate survival data
library(prodlim)
set.seed(100)
d <- sampleData(100,outcome="survival")
d[,X1:=as.numeric(as.character(X1))]
d[,X2:=as.numeric(as.character(X2))]
# then fit a Cox model
library(rms)
cphmodel <- cph(Surv(time,event)~X1+X2,data=d,surv=TRUE,x=TRUE,y=TRUE)
# or via survival
library(survival)
coxphmodel <- coxph(Surv(time,event)~X1+X2,data=d,x=TRUE,y=TRUE)

# Extract predicted event probabilities (1 - survival)
# at selected time-points:
ttt <- quantile(d$time)
# for selected predictor values:
ndat <- data.frame(X1=c(0.25,0.25,-0.05,0.05),X2=c(0,1,0,1))
# as follows
predictRisk(cphmodel,newdata=ndat,times=ttt)
predictRisk(coxphmodel,newdata=ndat,times=ttt)

# stratified cox model
sfit <- coxph(Surv(time,event)~strata(X1)+X2,data=d,x=TRUE,y=TRUE)
predictRisk(sfit,newdata=d[1:3,],times=c(1,3,5,10))

## simulate learning and validation data
learndat <- sampleData(100,outcome="survival")
valdat <- sampleData(100,outcome="survival")
## use the learning data to fit a Cox model
library(survival)
fitCox <- coxph(Surv(time,event)~X1+X2,data=learndat,x=TRUE,y=TRUE)
## suppose we want to predict the event probabilities (1-survival) for all
## subjects in the validation data at the following time points:
## 0, 12, 24, 36, 48, 60
psurv <- predictRisk(fitCox,newdata=valdat,times=seq(0,60,12))
## This is a matrix with event probabilities (1-survival)
## one column for each of the 6 time points
## one row for each validation set individual

# Do the same for a randomSurvivalForest model
# library(randomForestSRC)
# rsfmodel <- rfsrc(Surv(time,event)~X1+X2,data=learndat)
# prsfsurv=predictRisk(rsfmodel,newdata=valdat,times=seq(0,60,12))
# plot(psurv,prsfsurv)

## Cox with ridge option
f1 <- coxph(Surv(time,event)~X1+X2,data=learndat,x=TRUE,y=TRUE)
f2 <- coxph(Surv(time,event)~ridge(X1)+ridge(X2),data=learndat,x=TRUE,y=TRUE)
## Not run: 
plot(predictRisk(f1,newdata=valdat,times=10),
     riskRegression:::predictRisk.coxph(f2,newdata=valdat,times=10),
     xlim=c(0,1),
     ylim=c(0,1),
     xlab="Unpenalized predicted risk at 10",
     ylab="Ridge predicted risk at 10")

## End(Not run)

## competing risks

library(survival)
library(riskRegression)
library(prodlim)
train <- prodlim::SimCompRisk(100)
test <- prodlim::SimCompRisk(10)
cox.fit  <- CSC(Hist(time,cause)~X1+X2,data=train)
predictRisk(cox.fit,newdata=test,times=1:10,cause=1)

## with strata
cox.fit2  <- CSC(list(Hist(time,cause)~strata(X1)+X2,Hist(time,cause)~X1+X2),data=train)
predictRisk(cox.fit2,newdata=test,times=1:10,cause=1)

riskRegression

Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks

v2020.12.08
GPL (>= 2)
Authors
Thomas Alexander Gerds [aut, cre], Paul Blanche [ctb], Rikke Mortensen [ctb], Marvin Wright [ctb], Nikolaj Tollenaar [ctb], John Muschelli [ctb], Ulla Brasch Mogensen [ctb], Brice Ozenne [aut] (<https://orcid.org/0000-0001-9694-2956>)
Initial release
