Data Matrix and Weights for Discrete Subdistribution Hazard Models
Generates the augmented data matrix and the weights required for discrete subdistribution hazard modeling with right censoring.
dataLongSubDist(dataSet, timeColumn, eventColumns, eventFocus, timeAsFactor=TRUE)
dataSet |
Original data in short format. Must be of class "data.frame". |
timeColumn |
Character specifying the column name of the observed event times. It is required that the observed times are discrete (integer). |
eventColumns |
Character vector specifying the column names of the event indicators (excluding censoring events). It is required that a 0-1 coding is used for all events. The algorithm treats row sums of zero of all event columns as censored. |
eventFocus |
Column name of the event of interest (type 1 event). |
timeAsFactor |
Logical indicating whether time should be coded as a factor in the augmented data matrix. If FALSE, a numeric coding will be used. |
This function sets up the augmented data matrix and the weights that are needed for weighted maximum likelihood (ML) estimation of the discrete subdistribution model proposed by Berger et al. (2018). The model is a discrete-time extension of the original subdistribution model proposed by Fine and Gray (1999).
Data frame with additional column "subDistWeights". The latter column contains the weights that are needed for fitting a weighted binary regression model, as described in Berger et al. (2018). The weights are calculated by a life table estimator for the censoring event.
Thomas Welchowski welchow@imbie.meb.uni-bonn.de
Moritz Berger, Matthias Schmid, Thomas Welchowski, Steffen Schmitz-Valckenberg and Jan Beyersmann, (2018), Subdistribution Hazard Models for Competing Risks in Discrete Time, Biostatistics, Doi: 10.1093/biostatistics/kxy069
Jason P. Fine and Robert J. Gray, (1999), A proportional hazards model for the subdistribution of a competing risk, Journal of the American Statistical Association 94, pages 496-509.
# Example with unemployment data
library(Ecdat)
data(UnempDur)
# Generate subsample, reduce number of intervals to k = 5
SubUnempDur <- UnempDur [1:500, ]
SubUnempDur$time <- as.numeric(cut(SubUnempDur$spell, c(0,4,8,16,28)))
# Convert competing risks data to long format
# The event of interest is re-employment at full job
SubUnempDurLong <- dataLongSubDist (dataSet=SubUnempDur, timeColumn="time",
eventColumns=c("censor1", "censor2", "censor3"), eventFocus="censor1")
head(SubUnempDurLong)
# Fit discrete subdistribution hazard model with logistic link function
logisticSubDistr <- glm(y ~ timeInt + ui + age + logwage,
family=binomial(), data = SubUnempDurLong,
weights=SubUnempDurLong$subDistWeights)
summary(logisticSubDistr)Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.