npcure: beran – R documentation

Pricing

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

Get Started for Free

Documentation

npcure

beran

Compute Beran's Estimator of the Conditional Survival

Description

This function computes the Beran nonparametric estimator of the conditional survival function.

Usage

beran(x, t, d, dataset, x0, h, local = TRUE, testimate = NULL,
conflevel = 0L, cvbootpars = if (conflevel == 0 && !missing(h)) NULL
else controlpars())

Arguments

`x`	If `dataset` is missing, a numeric object giving the covariate values. If `dataset` is a data frame, it is interpreted as the name of the variable corresponding to the covariate in the data frame.
`t`	If `dataset` is missing, a numeric object giving the observed times. If `dataset` is a data frame, it is interpreted as the name of the variable corresponding to the observed times in the data frame.
`d`	If `dataset` is missing, an integer object giving the values of the uncensoring indicator. Censored observations must be coded as 0, uncensored ones as 1. If `dataset` is a data frame, it is interpreted as the name of the variable corresponding to the uncensoring indicator.
`dataset`	An optional data frame in which the variables named in `x`, `t` and `d` are interpreted. If it is missing, `x`, `t` and `d` must be objects of the workspace.
`x0`	A numeric vector of covariate values where the survival estimates will be computed.
`h`	A numeric vector of bandwidths. If it is missing the default is to use the cross-validation bandwidth computed by the `berancv` function.
`local`	A logical value, `TRUE` by default, specifying whether local or global bandwidths are used.
`testimate`	A numeric vector specifying the times at which the survival is estimated. By default it is `NULL`, and then the survival is estimated at the times given by `t`.
`conflevel`	A value controlling whether bootstrap confidence intervals (CI) of the survival are to be computed. With the default value, 0L, the CIs are not computed. If a numeric value between 0 and 1 is passed, it specifies the confidence level of the CIs.
`cvbootpars`	A list of parameters controlling the bootstrap when computing the CIs of the survival: `B`, the number of bootstrap resamples, and `nnfrac`, the fraction of the sample size that determines the order of the nearest neighbor used for choosing a pilot bandwidth. If `h` is missing the list of parameters is extended to be the same used for computing the cross-validation bandwidth (see the help of `berancv` for details). The default is the value returned by the `controlpars` function called without arguments. In case the CIs are not computed and `h` is not missing the default is `NULL`.

Details

This function computes the kernel type product-limit estimator of the conditional survival function S(t | x) = P(Y > t | X = x) under censoring, using the Nadaraya-Watson weights. The kernel used is the Epanechnikov. If the smoothing parameter h is not provided, then the cross-validation bandwidth selector in Geerdens et al. (2018) is used. The function is available only for one continuous covariate X.

Value

An object of S3 class 'npcure'. Formally, a list of components:

`type`	The constant string "survival".
`local`	The value of the `local` argument.
`h`	The value of the `h` argument, unless this is missing, in which case its value is that of the cross-validation bandwidth.
`x0`	The value of the `x0` argument.
`testim`	The numeric vector of time values where the survival function is estimated.
`S`	A list whose components are the estimates of the survival function for each one of the covariate values, i.e., those specified by the `x0` argument. The survival estimates are given at the times determined by the `testimate` argument.

Author(s)

Ignacio López-de-Ullibarri [aut, cre], Ana López-Cheda [aut], Maria Amalia Jácome [aut]

References

Beran, R. (1981). Nonparametric regression with randomly censored survival data. Technical report, University of California, Berkeley.

Geerdens, C., Acar, E. F., Janssen, P. (2018). Conditional copula models for right-censored clustered event time data. Biostatistics, 19(2): 247-262. https://doi.org/10.1093/biostatistics/kxx034.

Examples

## Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) ## Covariate values
y <- rweibull(n, shape = .5*(x + 4)) ## True lifetimes
c <- rexp(n) ## Censoring values
p <- exp(2*x)/(1 + exp(2*x)) ## Probability of being susceptible
u <- runif(n)
t <- ifelse(u < p, pmin(y, c), c) ## Observed times
d <- ifelse(u < p, ifelse(y < c, 1, 0), 0) ## Uncensoring indicator
data <- data.frame(x = x, t = t, d = d)

## Survival estimates for covariate values 0, 0.5 using...
## ... (a) global bandwidths 0.3, 0.5, 1.
## By default, the estimates are computed at the observed times
x0 <- c(0, .5)
S1 <- beran(x, t, d, data, x0 = x0, h = c(.3, .5, 1), local = FALSE) 

## Plot predicted survival curves for covariate value 0.5
plot(S1$testim, S1$S$h0.3$x0.5, type = "s", xlab = "Time", ylab =
"Survival", ylim = c(0, 1)) 
lines(S1$testim, S1$S$h0.5$x0.5, type = "s", lty = 2)
lines(S1$testim, S1$S$h1$x0.5, type = "s", lty = 3)
## The true survival curve is plotted for reference
p0 <- exp(2*x0[2])/(1 + exp(2*x0[2]))
lines(S1$testim, 1 - p0 + p0*pweibull(S1$testim, shape = .5*(x0[2] + 4),
lower.tail = FALSE), col = 2)
legend("topright", c("Estimate, h = 0.3", "Estimate, h = 0.5",
"Estimate, h = 1", "True"), lty = c(1:3, 1), col = c(rep(1, 3), 2))

## As before, but with estimates computed at fixed times 0.1, 0.2,...,1
S2 <- beran(x, t, d, data, x0 = x0, h = c(.3, .5, 1), local = FALSE,
testimate = .1*(1:10))

## ... (b) local bandwidths 0.3, 0.5.
## Note that the length of the covariate vector x0 and the bandwidth h
## must be the same.
S3 <- beran(x, t, d, data, x0 = x0, h = c(.3, .5), local = TRUE)

## ... (c) the cross-validation (CV) bandwidth selector (the default
## when the bandwidth argument is not provided). 
## The CV bandwidth is searched in a grid of 150 bandwidths (hl = 150)
## between 0.2 and 2 times the standardized interquartile range
## of the covariate values (hbound = c(.2, 2)).
## 95% confidence intervals are also given.
S4 <- beran(x, t, d, data, x0 = x0, conflevel = .95, cvbootpars =
controlpars(hl = 150, hbound = c(.2, 2))) 
     
## Plot of predicted survival curve and confidence intervals for
## covariate value 0.5 
plot(S4$testim, S4$S$x0.5, type = "s", xlab = "Time", ylab = "Survival",
ylim = c(0, 1))
lines(S4$testim, S4$conf$x0.5$lower, type = "s", lty = 2)
lines(S4$testim, S4$conf$x0.5$upper, type = "s", lty = 2)
lines(S4$testim, 1 - p0 + p0 * pweibull(S4$testim, shape = .5*(x0[2] +
4), lower.tail = FALSE), col = 2) 
legend("topright", c("Estimate with CV bandwidth", "95% CI limits",
"True"), lty = c(1, 2, 1), col = c(1, 1, 2))


## Example with the dataset 'bmt' in the 'KMsurv' package
## to study the survival of patients aged 25 and 40.
data("bmt", package = "KMsurv")
x0 <- c(25, 40)
S <- beran(z1, t2, d3, bmt, x0 = x0, conflevel = .95)
## Plot of predicted survival curves and confidence intervals
plot(S$testim, S$S$x25, type = "s", xlab = "Time", ylab = "Survival",
ylim = c(0, 1))
lines(S$testim, S$conf$x25$lower, type = "s", lty = 2)
lines(S$testim, S$conf$x25$upper, type = "s", lty = 2)
lines(S$testim, S$S$x40, type = "s", lty = 1, col = 2)
lines(S$testim, S$conf$x40$lower, type = "s", lty = 2, col = 2)
lines(S$testim, S$conf$x40$upper, type = "s", lty = 2, col = 2)
legend("topright", c("Age 25: Estimate", "Age 25: 95% CI limits",
"Age 40: Estimate", "Age 40: 95% CI limits"), lty = 1:2,
col = c(1, 1, 2, 2))

npcure

Nonparametric Estimation in Mixture Cure Models

v0.1-5

GPL (>= 2)

Authors

Ignacio López-de-Ullibarri [aut, cre], Ana López-Cheda [aut], Maria Amalia Jácome [aut]

Initial release

2020-02-28

beran

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

npcure

We don't support your browser anymore