riskRegression: predictRisk – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

predictRisk

Extrating predicting risks from regression models

Description

Extract event probabilities from fitted regression models and machine learning objects. The function predictRisk is a generic function, meaning that it invokes specifically designed functions depending on the 'class' of the first argument. See predictRisk.

Usage

predictRisk(object, newdata, ...)

## Default S3 method:
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'double'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'integer'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'factor'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'numeric'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'glm'
predictRisk(object, newdata, iid = FALSE, average.iid = FALSE, ...)

## S3 method for class 'formula'
predictRisk(object, newdata, ...)

## S3 method for class 'BinaryTree'
predictRisk(object, newdata, ...)

## S3 method for class 'lrm'
predictRisk(object, newdata, ...)

## S3 method for class 'rpart'
predictRisk(object, newdata, ...)

## S3 method for class 'randomForest'
predictRisk(object, newdata, ...)

## S3 method for class 'matrix'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'aalen'
predictRisk(object, newdata, times, ...)

## S3 method for class 'cox.aalen'
predictRisk(object, newdata, times, ...)

## S3 method for class 'coxph'
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'coxphTD'
predictRisk(object, newdata, times, landmark, ...)

## S3 method for class 'CSCTD'
predictRisk(object, newdata, times, cause, landmark, ...)

## S3 method for class 'coxph.penal'
predictRisk(object, newdata, times, ...)

## S3 method for class 'cph'
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'selectCox'
predictRisk(object, newdata, times, ...)

## S3 method for class 'prodlim'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'survfit'
predictRisk(object, newdata, times, ...)

## S3 method for class 'psm'
predictRisk(object, newdata, times, ...)

## S3 method for class 'ranger'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'rfsrc'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'FGR'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'riskRegression'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'ARR'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'CauseSpecificCox'
predictRisk(
  object,
  newdata,
  times,
  cause,
  product.limit = TRUE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

## S3 method for class 'penfitS3'
predictRisk(object, newdata, times, ...)

## S3 method for class 'SuperPredictor'
predictRisk(object, newdata, ...)

## S3 method for class 'gbm'
predictRisk(object, newdata, times, ...)

## S3 method for class 'flexsurvreg'
predictRisk(object, newdata, times, ...)

## S3 method for class 'singleEventCB'
predictRisk(object, newdata, times, cause, ...)

## S3 method for class 'wglm'
predictRisk(
  object,
  newdata,
  times = NULL,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

Arguments

`object`	A fitted model from which to extract predicted event probabilities.
`newdata`	A data frame containing predictor variable combinations for which to compute predicted event probabilities.
`...`	Additional arguments that are passed on to the current method.
`times`	A vector of times in the range of the response variable, for which the cumulative incidences event probabilities are computed.
`cause`	Identifies the cause of interest among the competing events.
`iid`	Should the iid decomposition be output using an attribute?
`average.iid`	Should the average iid decomposition be output using an attribute?
`product.limit`	If `TRUE` the survival is computed using the product limit estimator. Otherwise the exponential approximation is used (i.e. exp(-cumulative hazard)).
`diag`	when `FALSE` the hazard/cumlative hazard/survival for all observations at all times is computed, otherwise it is only computed for the i-th observation at the i-th time.
`landmark`	The starting time for the computation of the cumulative risk.

Details

In uncensored binary outcome data there is no need to choose a time point.

When operating on models for survival analysis (without competing risks) the function still predicts the risk, as 1 - S(t|X) where S(t|X) is survival chance of a subject characterized by X.

When there are competing risks (and the data are right censored) one needs to specify both the time horizon for prediction (can be a vector) and the cause of the event. The function then extracts the absolute risks F_c(t|X) aka the cumulative incidence of an event of type/cause c until time t for a subject characterized by X. Depending on the model it may or not be possible to predict the risk of all causes in a competing risks setting. For example. a cause-specific Cox (CSC) object allows to predict both cases whereas a Fine-Gray regression model (FGR) is specific to one of the causes.

Value

For binary outcome a vector with predicted risks. For survival outcome with and without competing risks a matrix with as many rows as NROW(newdata) and as many columns as length(times). Each entry is a probability and in rows the values should be increasing.

Author(s)

Thomas A. Gerds tag@biostat.ku.dk

Examples

## binary outcome
library(rms)
set.seed(7)
d <- sampleData(80,outcome="binary")
nd <- sampleData(80,outcome="binary")
fit <- lrm(Y~X1+X8,data=d)
predictRisk(fit,newdata=nd)
## Not run: 
library(SuperLearner)
set.seed(1)
sl = SuperLearner(Y = d$Y, X = d[,-1], family = binomial(),
      SL.library = c("SL.mean", "SL.glmnet", "SL.randomForest"))

## End(Not run)

## survival outcome
# generate survival data
library(prodlim)
set.seed(100)
d <- sampleData(100,outcome="survival")
d[,X1:=as.numeric(as.character(X1))]
d[,X2:=as.numeric(as.character(X2))]
# then fit a Cox model
library(rms)
cphmodel <- cph(Surv(time,event)~X1+X2,data=d,surv=TRUE,x=TRUE,y=TRUE)
# or via survival
library(survival)
coxphmodel <- coxph(Surv(time,event)~X1+X2,data=d,x=TRUE,y=TRUE)

# Extract predicted survival probabilities 
# at selected time-points:
ttt <- quantile(d$time)
# for selected predictor values:
ndat <- data.frame(X1=c(0.25,0.25,-0.05,0.05),X2=c(0,1,0,1))
# as follows
predictRisk(cphmodel,newdata=ndat,times=ttt)
predictRisk(coxphmodel,newdata=ndat,times=ttt)

# stratified cox model
sfit <- coxph(Surv(time,event)~strata(X1)+X2,data=d,x=TRUE,y=TRUE)
predictRisk(sfit,newdata=d[1:3,],times=c(1,3,5,10))

## simulate learning and validation data
learndat <- sampleData(100,outcome="survival")
valdat <- sampleData(100,outcome="survival")
## use the learning data to fit a Cox model
library(survival)
fitCox <- coxph(Surv(time,event)~X1+X2,data=learndat,x=TRUE,y=TRUE)
## suppose we want to predict the survival probabilities for all subjects
## in the validation data at the following time points:
## 0, 12, 24, 36, 48, 60
psurv <- predictRisk(fitCox,newdata=valdat,times=seq(0,60,12))
## This is a matrix with event probabilities (1-survival)
## one column for each of the 5 time points
## one row for each validation set individual

# Do the same for a randomSurvivalForest model
# library(randomForestSRC)
# rsfmodel <- rfsrc(Surv(time,event)~X1+X2,data=learndat)
# prsfsurv=predictRisk(rsfmodel,newdata=valdat,times=seq(0,60,12))
# plot(psurv,prsfsurv)

## Cox with ridge option
f1 <- coxph(Surv(time,event)~X1+X2,data=learndat,x=TRUE,y=TRUE)
f2 <- coxph(Surv(time,event)~ridge(X1)+ridge(X2),data=learndat,x=TRUE,y=TRUE)
## Not run: 
plot(predictRisk(f1,newdata=valdat,times=10),
     riskRegression:::predictRisk.coxph(f2,newdata=valdat,times=10),
     xlim=c(0,1),
     ylim=c(0,1),
     xlab="Unpenalized predicted survival chance at 10",
     ylab="Ridge predicted survival chance at 10")

## End(Not run)

## competing risks

library(survival)
library(riskRegression)
library(prodlim)
train <- prodlim::SimCompRisk(100)
test <- prodlim::SimCompRisk(10)
cox.fit  <- CSC(Hist(time,cause)~X1+X2,data=train)
predictRisk(cox.fit,newdata=test,times=seq(1:10),cause=1)

## with strata
cox.fit2  <- CSC(list(Hist(time,cause)~strata(X1)+X2,Hist(time,cause)~X1+X2),data=train)
predictRisk(cox.fit2,newdata=test,times=seq(1:10),cause=1)

riskRegression

Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks

v2020.12.08

GPL (>= 2)

Authors

Thomas Alexander Gerds [aut, cre], Paul Blanche [ctb], Rikke Mortensen [ctb], Marvin Wright [ctb], Nikolaj Tollenaar [ctb], John Muschelli [ctb], Ulla Brasch Mogensen [ctb], Brice Ozenne [aut] (<https://orcid.org/0000-0001-9694-2956>)

Initial release