openEBGM: negLLsquash – R documentation

negLLsquash

Likelihood with data squashing and no zero counts

Description

negLLsquash computes the negative log-likelihood based on the conditional marginal distribution of the counts, N, given that N >= N*, where N* is the smallest count used for estimating the hyperparameters. Minimize this function to estimate the hyperparameters of the prior distribution. Use it when zero counts are excluded and the data have been squashed as described by DuMouchel and Pregibon (2001). In most cases this is the likelihood function to choose.

Usage

negLLsquash(theta, ni, ei, wi, N_star = 1)

Arguments

`theta`	A numeric vector of hyperparameters ordered as: α_1, β_1, α_2, β_2, P.
`ni`	A whole number vector of squashed actual counts from `squashData`.
`ei`	A numeric vector of squashed expected counts from `squashData`.
`wi`	A whole number vector of bin weights from `squashData`.
`N_star`	A scalar whole number for the minimum count size used.

Details

The conditional marginal distribution for the counts, N, given that N >= N*, is based on a mixture of two negative binomial distributions. The hyperparameters of the prior distribution (a mixture of two gammas) are estimated by minimizing the negative log-likelihood derived from this conditional marginal distribution. Using N_star = 1 is recommended when practical.
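As a sketch of what this paragraph describes, the weighted negative log-likelihood can be written as follows. The notation is an assumption for illustration: β is treated as a gamma rate parameter, E_i is the expected count, w_i is the bin weight, and f and m are helper symbols introduced here. The package source remains the authority on the exact expression.

```latex
\begin{align}
f(n;\alpha,\beta,E) &= \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\,n!}
  \left(\frac{\beta}{\beta+E}\right)^{\alpha}
  \left(\frac{E}{\beta+E}\right)^{n} \\
m(n;\theta,E) &= P\,f(n;\alpha_1,\beta_1,E) + (1-P)\,f(n;\alpha_2,\beta_2,E) \\
-\ell(\theta) &= -\sum_i w_i \,\log
  \frac{m(n_i;\theta,E_i)}{1-\sum_{k=0}^{N^*-1} m(k;\theta,E_i)}
\end{align}
```

The denominator is the probability that a count is at least N*, which is what conditioning on N >= N* divides out.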

The hyperparameters are:

α_1, β_1: Parameters of the first component of the prior distribution, which also determine the first component of the marginal distribution of the counts
α_2, β_2: Parameters of the second component
P: Mixture fraction (the weight of the first component)
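To make the role of each hyperparameter concrete, here is a minimal sketch of a likelihood of this kind in base R. It is not the package's implementation: `neg_ll_sketch` is a hypothetical helper, the toy inputs are invented, and the code assumes each β is a gamma rate parameter so that each component is negative binomial with size α and probability β / (β + E).

```r
# Hypothetical helper, not part of openEBGM: a weighted, truncated
# two-component negative binomial negative log-likelihood.
neg_ll_sketch <- function(theta, ni, ei, wi, N_star = 1) {
  a1 <- theta[1]; b1 <- theta[2]
  a2 <- theta[3]; b2 <- theta[4]
  P  <- theta[5]

  # Assumption: gamma(shape = a, rate = b) prior with a Poisson count
  # gives a negative binomial with size = a and prob = b / (b + E).
  p1 <- b1 / (b1 + ei)
  p2 <- b2 / (b2 + ei)

  # Unconditional mixture probability of each observed count
  num <- P * dnbinom(ni, size = a1, prob = p1) +
    (1 - P) * dnbinom(ni, size = a2, prob = p2)

  # Probability that a count is at least N_star (the truncation term)
  den <- P * pnbinom(N_star - 1, size = a1, prob = p1, lower.tail = FALSE) +
    (1 - P) * pnbinom(N_star - 1, size = a2, prob = p2, lower.tail = FALSE)

  # Each squashed point stands in for wi original points
  -sum(wi * log(num / den))
}

# Toy inputs, invented for illustration only
theta <- c(0.2, 0.1, 2, 4, 1/3)
ni <- c(1, 1, 2, 5)
ei <- c(0.5, 1.2, 0.8, 1.1)
wi <- c(10, 4, 1, 1)
nll <- neg_ll_sketch(theta, ni, ei, wi)
```

In this sketch a weight of w has the same effect as repeating the point w times, which is the sense in which squashing preserves the likelihood.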

You do not need to call this function directly when using exploreHypers or autoHyper.

Value

A scalar negative log-likelihood value

Warnings

Make sure N_star matches the smallest actual count in ni before using this function. Filter ni, ei, and wi if needed.
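The filtering step might look like the sketch below. The data frame `squashed_toy` is invented for illustration; real input would come from squashData, and the column names N, E, and weight are taken from the Examples section.

```r
# Toy squashed data, invented for illustration only
squashed_toy <- data.frame(
  N      = c(1, 1, 2, 3, 7),
  E      = c(0.4, 1.5, 0.9, 2.2, 1.0),
  weight = c(25, 8, 3, 1, 1)
)

N_star <- 2

# Filter whole rows so that ni, ei, and wi stay aligned
kept <- squashed_toy[squashed_toy$N >= N_star, ]

# Guard: the smallest remaining count should equal N_star
stopifnot(min(kept$N) == N_star)

ni <- kept$N
ei <- kept$E
wi <- kept$weight
```

Filtering rows of the data frame, instead of subsetting the three vectors separately, avoids misaligned counts and weights.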

Make sure the data were actually squashed (see squashData) before using this function.

References

DuMouchel W, Pregibon D (2001). "Empirical Bayes Screening for Multi-item Associations." In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '01, pp. 67-76. ACM, New York, NY, USA. ISBN 1-58113-391-X.

Examples

library(openEBGM)

# Initial guess, ordered as alpha1, beta1, alpha2, beta2, P
theta_init <- c(0.2, 0.1, 2, 4, 1/3)

# Process the raw data, then squash the N = 1 and N = 2 points in turn
data(caers)
proc <- processRaw(caers)
squashed <- squashData(proc, count = 1, bin_size = 100, keep_pts = 100)
squashed <- squashData(squashed, count = 2, bin_size = 10, keep_pts = 20)

# Evaluate the negative log-likelihood at the initial guess
negLLsquash(theta = theta_init, ni = squashed$N, ei = squashed$E,
            wi = squashed$weight)

# For hyperparameter estimation, minimize over theta
stats::nlminb(start = theta_init, objective = negLLsquash, ni = squashed$N,
              ei = squashed$E, wi = squashed$weight)

openEBGM

EBGM Disproportionality Scores for Adverse Event Data Mining

v0.8.3

GPL-2 | GPL-3

Authors

John Ihrie [cre, aut], Travis Canida [aut], Ismaïl Ahmed [ctb] (author of 'PhViD' package (derived code)), Antoine Poncet [ctb] (author of 'PhViD'), Sergio Venturini [ctb] (author of 'mederrRank' package (derived code)), Jessica Myers [ctb] (author of 'mederrRank')

Initial release