surveillance: LRCUSUM.runlength – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

LRCUSUM.runlength

Run length computation of a CUSUM detector

Description

Compute run length for a count data or categorical CUSUM. The computations are based on a Markov representation of the likelihood ratio based CUSUM.

Usage

LRCUSUM.runlength(mu,mu0,mu1,h,dfun, n, g=5,outcomeFun=NULL,...)

Arguments

`mu`	k-1 \times T matrix with true proportions, i.e. equal to mu0 or mu1 if one wants to compute e.g. ARL_0 or ARL_1.
`mu0`	k-1 \times T matrix with in-control proportions
`mu1`	k-1 \times T matrix with out-of-control proportion
`h`	The threshold h which is used for the CUSUM.
`dfun`	The probability mass function or density used to compute the likelihood ratios of the CUSUM. In a negative binomial CUSUM this is `dnbinom`, in a binomial CUSUM `dbinom` and in a multinomial CUSUM `dmultinom`.
`n`	Vector of length T containing the total number of experiments for each time point.
`g`	The number of levels to cut the state space into when performing the Markov chain approximation. Sometimes also denoted M. Note that the quality of the approximation depends very much on g. If T greater than, say, 50 its necessary to increase the value of g.
`outcomeFun`	A hook `function (k,n)` to compute all possible outcome states to compute the likelihood ratio for. If `NULL` then the internal default function `surveillance:::outcomeFunStandard` is used. This function uses the Cartesian product of `0:n` for `k` components.
`...`	Additional arguments to send to `dfun`.

Details

Brook and Evans (1972) formulated an approximate approach based on Markov chains to determine the PMF of the run length of a time-constant CUSUM detector. They describe the dynamics of the CUSUM statistic by a Markov chain with a discretized state space of size g+2. This is adopted to the time varying case in Höhle (2010) and implemented in R using the ... notation such that it works for a very large class of distributions.

Value

A list with five components

`P`	An array of g+2 \times g+2 transition matrices of the approximation Markov chain.
`pmf`	Probability mass function (up to length T) of the run length variable.
`cdf`	Cumulative density function (up to length T) of the run length variable.
`arl`	If the model is time homogenous (i.e. if T==1) then the ARL is computed based on the stationary distribution of the Markov chain. See the eqns in the reference for details. Note: If the model is not time homogeneous then the function returns `NA` and the ARL has to be approximated manually from the output. One could use `sum(1:length(pmf) * pmf)`, which is an approximation because of using a finite support for a sum which should be from 1 to infinity.

Author(s)

M. Höhle

References

Höhle, M. (2010): Online change-point detection in categorical time series. In: T. Kneib and G. Tutz (Eds.), Statistical Modelling and Regression Structures - Festschrift in Honour of Ludwig Fahrmeir, Physica-Verlag, pp. 377-397. Preprint available as https://staff.math.su.se/hoehle/pubs/hoehle2010-preprint.pdf

Höhle, M. and Mazick, A. (2010): Aberration detection in R illustrated by Danish mortality monitoring. In: T. Kass-Hout and X. Zhang (Eds.), Biosurveillance: A Health Protection Priority, CRCPress. Preprint available as https://staff.math.su.se/hoehle/pubs/hoehle_mazick2009-preprint.pdf

Brook, D. and Evans, D. A. (1972): An approach to the probability distribution of cusum run length. Biometrika 59(3):539-549.

Examples

######################################################
#Run length of a time constant negative binomial CUSUM
######################################################

#In-control and out of control parameters
mu0 <- 10
alpha <- 1/2
kappa <- 2

#Density for comparison in the negative binomial distribution
dY <- function(y,mu,log=FALSE, alpha, ...) {
  dnbinom(y, mu=mu, size=1/alpha, log=log)
}

#In this case "n" is the maximum value to investigate the LLR for
#It is assumed that beyond n the LLR is too unlikely to be worth
#computing.
LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=5,
  dfun = dY, n=rep(100,length(mu0)), alpha=alpha)

h.grid <- seq(3,6,by=0.3)
arls <- sapply(h.grid, function(h) {
  LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=h,
  dfun = dY, n=rep(100,length(mu0)), alpha=alpha,g=20)$arl
})
plot(h.grid, arls,type="l",xlab="threshold h",ylab=expression(ARL[0]))


if (surveillance.options("allExamples"))
{
  ######################################################
  #Run length of a time varying negative binomial CUSUM
  ######################################################

  mu0 <- matrix(5*sin(2*pi/52 * 1:200) + 10,ncol=1)
  
  rl <- LRCUSUM.runlength( mu=t(mu0), mu0=t(mu0), mu1=kappa*t(mu0), h=2,
    dfun = dY, n=rep(100,length(mu0)), alpha=alpha,g=20)

  plot(1:length(mu0),rl$pmf,type="l",xlab="t",ylab="PMF")
  plot(1:length(mu0),rl$cdf,type="l",xlab="t",ylab="CDF")
}


########################################################
# Further examples contain the binomial, beta-binomial
# and multinomial CUSUMs. Hopefully, these will be added
# in the future.
########################################################

#dfun function for the multinomial distribution (Note: Only k-1 categories are specified).
dmult <- function(y, size,mu, log = FALSE) {
    return(dmultinom(c(y,size-sum(y)), size = size, prob=c(mu,1-sum(mu)), log = log))
}

#Example for the time-constant multinomial distribution
#with size 100 and in-control and out-of-control parameters as below.
n <- 100
pi0 <- as.matrix(c(0.5,0.3,0.2))
pi1 <- as.matrix(c(0.38,0.46,0.16))

#ARL_0
LRCUSUM.runlength(mu=pi0[1:2,,drop=FALSE],mu0=pi0[1:2,,drop=FALSE],mu1=pi1[1:2,,drop=FALSE],
                  h=5,dfun=dmult, n=n, g=15)$arl
#ARL_1
LRCUSUM.runlength(mu=pi1[1:2,,drop=FALSE],mu0=pi0[1:2,,drop=FALSE],mu1=pi1[1:2,,drop=FALSE],
                  h=5,dfun=dmult, n=n, g=15)$arl

surveillance

Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

v1.19.1

GPL-2

Authors

Michael H<f6>hle [aut, ths] (<https://orcid.org/0000-0002-0423-6702>), Sebastian Meyer [aut, cre] (<https://orcid.org/0000-0002-1791-9449>), Michaela Paul [aut], Leonhard Held [ctb, ths], Howard Burkom [ctb], Thais Correa [ctb], Mathias Hofmann [ctb], Christian Lang [ctb], Juliane Manitz [ctb], Andrea Riebler [ctb], Daniel Saban<e9>s Bov<e9> [ctb], Ma<eb>lle Salmon [ctb], Dirk Schumacher [ctb], Stefan Steiner [ctb], Mikko Virtanen [ctb], Wei Wei [ctb], Valentin Wimmer [ctb], R Core Team [ctb] (A few code segments are modified versions of code from base R)

Initial release

2021-03-30