General Template for Collaborative Targeted Maximum Likelihood Estimation
This function computes the Collaborative Targeted Maximum Likelihood Estimator (C-TMLE) from a user-supplied sequence of propensity score estimates.
Usage

ctmleGeneral(Y, A, W, Wg = W, Q, ctmletype, gn_candidates,
  gn_candidates_cv = NULL, alpha = 0.995, family = "gaussian",
  gbound = 0.025, like_type = "RSS", fluctuation = "logistic",
  verbose = FALSE, detailed = FALSE, PEN = FALSE, g1W = NULL,
  g1WPrev = NULL, V = 5, folds = NULL, stopFactor = 10^6)
Arguments

Y
  continuous or binary outcome variable
A
  binary treatment indicator, 1 for treatment, 0 for control
W
  vector, matrix, or data frame containing baseline covariates for Q bar
Wg
  vector, matrix, or data frame containing baseline covariates for the propensity score model (defaults to W if not supplied by the user)
Q
  n by 2 matrix of initial values for Q0W and Q1W in columns 1 and 2, respectively. The current version does not support SuperLearner for automatic initial estimation of Q bar
ctmletype
  1 or 3. Type of general C-TMLE. Type 1 uses cross-validation to select the best gn, while Type 3 directly solves the extra clever covariates
gn_candidates
  matrix or data frame in which each column is an estimate of the propensity score. Columns with larger indices should have smaller empirical loss (see the sketch following this argument list)
gn_candidates_cv
  matrix or data frame in which each column is the corresponding cross-validated estimate. For example, the (i, j)-th element is the propensity score predicted by the j-th estimator for the i-th observation when that observation is in the validation set
alpha
  used to keep the predicted initial values bounded away from 0 and 1 for the logistic fluctuation; 0.995 (default)
family
  family specification for the working regression models, generally 'gaussian' for continuous outcomes (default), 'binomial' for binary outcomes
gbound
  bound on P(A=1|W), defaults to 0.025
like_type
  'RSS' or 'loglike'. The metric to use for forward selection and cross-validation
fluctuation
  'logistic' (default) or 'linear', for the targeting step
verbose
  print status messages if TRUE
detailed
  boolean. If TRUE, return more detailed results
PEN
  boolean. If TRUE, the penalized loss is used in the cross-validation step
g1W
  only used when ctmletype is 3; a user-supplied propensity score estimate
g1WPrev
  only used when ctmletype is 3; a user-supplied propensity score estimate with a small fluctuation relative to g1W
V
  number of folds. Only used if folds is not specified
folds
  the list of indices for the cross-validation step. We recommend that the CV splits used by C-TMLE match those used for gn_candidates_cv (see the sketch following this argument list)
stopFactor
  numerical value with default 1e6. If the current empirical likelihood is stopFactor times larger than the best previous one, the construction stops
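The gn_candidates, gn_candidates_cv, and folds arguments are typically built together. A minimal sketch, assuming A and W are defined as in the example below and that the SuperLearner wrappers SL.glm and SL.glmnet are available (the two-learner library here is purely illustrative):

library(ctmle)
library(SuperLearner)

# Construct folds, gn_candidates, and gn_candidates_cv together, so the
# CV splits used by C-TMLE match those used for the propensity candidates
V <- 5
N <- length(A)                 # A and W as described above
folds <- by(sample(1:N, N), rep(1:V, length = N), list)

# Candidate propensity learners; build_gn_seq (from this package) fits each
# learner and returns the candidate estimates and their cross-validated
# counterparts, one column per learner
SL.library <- c("SL.glm", "SL.glmnet")
gn_seq <- build_gn_seq(A = A, W = W, SL.library = SL.library, folds = folds)

# gn_seq$gn_candidates, gn_seq$gn_candidates_cv, and gn_seq$folds can then
# be passed directly to ctmleGeneral()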
Value

The function returns an object containing the following components:

best_k
  the index of the gn estimate selected by cross-validation
est
  estimate of psi_0
CI
  IC-based 95% confidence interval
pvalue
  p-value for the null hypothesis that Psi = 0
likelihood
  sum of squared residuals for the selected estimator evaluated on all observations, or the logistic log-likelihood if like_type != "RSS"
varIC
  empirical variance of the influence curve adjusted for the estimation of g
varDstar
  empirical variance of the influence curve
var.psi
  variance of the estimate
varIC.cv
  cross-validated variance of the influence curve
penlikelihood.cv
  penalized cross-validated likelihood
cv.res
  all cross-validation results for each fold
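A brief sketch of how these components might be accessed, where fit is a placeholder name for the object returned by ctmleGeneral:

fit$best_k    # index of the gn candidate selected by cross-validation
fit$est       # point estimate of psi_0
fit$CI        # IC-based 95% confidence interval
fit$pvalue    # p-value for the null hypothesis Psi = 0
fit$var.psi   # variance of the estimate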
Examples

library(ctmle)
library(SuperLearner)
library(glmnet)

N <- 1000
p <- 100
V <- 5

# Simulate baseline covariates and a treatment whose propensity score
# depends on the covariates
Wmat <- matrix(rnorm(N * p), ncol = p)
gcoef <- matrix(c(-1, -1, rep(-(3 / (p - 2)), p - 2)), ncol = 1)
W <- as.data.frame(Wmat)
g <- 1 / (1 + exp(Wmat %*% gcoef / 3))
A <- rbinom(N, 1, prob = g)

# Build potential outcome pairs, and the observed outcome Y
beta1 <- 4 + 2 * Wmat[, 1] + 2 * Wmat[, 2] + 2 * Wmat[, 5] + 2 * Wmat[, 6] + 2 * Wmat[, 8]
beta0 <- 2 + 2 * Wmat[, 1] + 2 * Wmat[, 2] + 2 * Wmat[, 5] + 2 * Wmat[, 6] + 2 * Wmat[, 8]
tau <- 2
sigma <- 1
epsilon <- rnorm(N, 0, sigma)
Y <- beta0 + tau * A + epsilon

# Initial estimate of Q: simple treatment-arm means
Q <- cbind(rep(mean(Y[A == 1]), N), rep(mean(Y[A == 0]), N))

# Cross-validation folds shared by build_gn_seq and ctmleGeneral
folds <- by(sample(1:N, N), rep(1:V, length = N), list)

# Pick a sequence of LASSO penalties around the cross-validated optimum
lasso_fit <- cv.glmnet(x = as.matrix(W), y = A, alpha = 1, nlambda = 100, nfolds = 10)
lasso_lambdas <- lasso_fit$lambda[lasso_fit$lambda <= lasso_fit$lambda.min][1:5]

# Build a SuperLearner wrapper template for glmnet with a fixed lambda
SL.glmnet_new <- function(Y, X, newX, family, obsWeights, id,
                          alpha = 1, nlambda = 100, lambda = 0, ...) {
  if (!is.matrix(X)) {
    X <- model.matrix(~ -1 + ., X)
    newX <- model.matrix(~ -1 + ., newX)
  }
  fit <- glmnet::glmnet(x = X, y = Y, lambda = lambda,
                        family = family$family, alpha = alpha)
  pred <- predict(fit, newx = newX, type = "response")
  fit <- list(object = fit)
  class(fit) <- "SL.glmnet"
  out <- list(pred = pred, fit = fit)
  return(out)
}

# Use a sequence of LASSO estimators to build the gn sequence
SL.cv1lasso <- function(..., alpha = 1, lambda = lasso_lambdas[1]) {
  SL.glmnet_new(..., alpha = alpha, lambda = lambda)
}
SL.cv2lasso <- function(..., alpha = 1, lambda = lasso_lambdas[2]) {
  SL.glmnet_new(..., alpha = alpha, lambda = lambda)
}
SL.cv3lasso <- function(..., alpha = 1, lambda = lasso_lambdas[3]) {
  SL.glmnet_new(..., alpha = alpha, lambda = lambda)
}
SL.cv4lasso <- function(..., alpha = 1, lambda = lasso_lambdas[4]) {
  SL.glmnet_new(..., alpha = alpha, lambda = lambda)
}
SL.library <- c('SL.cv1lasso', 'SL.cv2lasso', 'SL.cv3lasso', 'SL.cv4lasso', 'SL.glm')

# Build the sequence. See more details in the function build_gn_seq,
# and the SuperLearner package
gn_seq <- build_gn_seq(A = A, W = W, SL.library = SL.library, folds = folds)

# Use the output of build_gn_seq for the general C-TMLE template
ctmle_fit <- ctmleGeneral(Y = Y, A = A, W = W, Q = Q, ctmletype = 1,
                          gn_candidates = gn_seq$gn_candidates,
                          gn_candidates_cv = gn_seq$gn_candidates_cv,
                          folds = gn_seq$folds, V = length(folds))
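As a quick follow-up to the example above, the selected candidate and headline estimate can be checked against the simulation truth (tau = 2); calling summary() on the fit is an assumption here, based on the print/summary methods the package provides for its other C-TMLE estimators:

ctmle_fit$best_k   # which gn candidate cross-validation selected
ctmle_fit$est      # should be close to the true effect tau = 2

# Assumed to work if ctmleGeneral returns the package's usual ctmle class
summary(ctmle_fit)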