Likelihood with data squashing and no zero counts
negLLsquash computes the negative log-likelihood based on the
conditional marginal distribution of the counts, N, given that
N >= N*, where N* is the smallest count used for estimating the
hyperparameters. This function is minimized to estimate the hyperparameters
of the prior distribution. Use it when zero counts are excluded and the data
have been squashed as described by DuMouchel and Pregibon (2001). This is
the likelihood function that should usually be chosen.
negLLsquash(theta, ni, ei, wi, N_star = 1)
theta | A numeric vector of hyperparameters ordered as: α_1, β_1, α_2, β_2, P.
ni | A whole number vector of squashed actual counts.
ei | A numeric vector of squashed expected counts.
wi | A whole number vector of bin weights.
N_star | A scalar whole number for the minimum count size used.
The conditional marginal distribution for the counts, N,
given that N >= N*, is based on a mixture of two negative binomial
distributions. The hyperparameters for the prior distribution (mixture of
gammas) are estimated by optimizing the likelihood equation from this
conditional marginal distribution. It is recommended to use N_star = 1
when practical.
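For readers who want the distribution written out, the block below is one way to express it. It assumes each gamma component is parameterized by shape α and rate β, that a count with expected value E is Poisson with mean λE, and that the truncation at N* is applied to the mixture as a whole. The symbols f, g, n_i, E_i, and w_i are notation introduced here for illustration; the package may arrange the computation differently.

```latex
% Negative binomial marginal for one gamma component (shape \alpha, rate \beta)
f(n;\alpha,\beta,E) = \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\,n!}
  \left(\frac{\beta}{\beta+E}\right)^{\alpha}
  \left(\frac{E}{\beta+E}\right)^{n}

% Two-component mixture with mixing fraction P
g(n;E) = P\,f(n;\alpha_1,\beta_1,E) + (1-P)\,f(n;\alpha_2,\beta_2,E)

% Conditional distribution given N \ge N^*
\Pr(N = n \mid N \ge N^*) = \frac{g(n;E)}{1-\sum_{k=0}^{N^*-1} g(k;E)},
  \qquad n \ge N^*

% Weighted negative log-likelihood over squashed points (n_i, E_i, w_i)
-\ell(\theta) = -\sum_i w_i \,\log \Pr(N = n_i \mid N \ge N^*;\, E_i)
```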
The hyperparameters are:
α_1, β_1: Parameters of the first component of the marginal distribution of the counts (also the prior distribution)
α_2, β_2: Parameters of the second component
P: Mixture fraction
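To make the role of each hyperparameter concrete, here is a minimal base R sketch of a weighted, truncated negative log-likelihood for a two-component negative binomial mixture. The function name neg_ll_sketch is hypothetical and this is not the package's internal code. It assumes each gamma component uses a shape/rate parameterization (so the negative binomial has size = α and prob = β / (β + E)) and that the condition N >= N* is applied to the mixture as a whole; results may differ from negLLsquash if the package handles the truncation differently.

```r
# Hypothetical illustration only; not the package's implementation.
neg_ll_sketch <- function(theta, ni, ei, wi, N_star = 1) {
  a1 <- theta[1]; b1 <- theta[2]  # first component (alpha_1, beta_1)
  a2 <- theta[3]; b2 <- theta[4]  # second component (alpha_2, beta_2)
  P  <- theta[5]                  # mixture fraction

  # Assumption: gamma(shape, rate) prior, so each marginal is negative
  # binomial with size = alpha and prob = beta / (beta + E).
  mix_pmf <- function(n) {
    P * stats::dnbinom(n, size = a1, prob = b1 / (b1 + ei)) +
      (1 - P) * stats::dnbinom(n, size = a2, prob = b2 / (b2 + ei))
  }
  mix_cdf <- function(n) {
    P * stats::pnbinom(n, size = a1, prob = b1 / (b1 + ei)) +
      (1 - P) * stats::pnbinom(n, size = a2, prob = b2 / (b2 + ei))
  }

  # Probability that N >= N_star under the mixture, one value per ei
  tail_prob <- 1 - mix_cdf(N_star - 1)

  # Each squashed point contributes its log-probability times its weight
  -sum(wi * (log(mix_pmf(ni)) - log(tail_prob)))
}

# Toy inputs (made up for illustration)
theta_toy <- c(0.2, 0.1, 2, 4, 1/3)
neg_ll_sketch(theta_toy, ni = c(1, 2, 5), ei = c(0.5, 1.5, 2), wi = c(10, 3, 1))
```

Because the weights multiply the log-probabilities, a squashed point with weight w contributes the same amount as w identical unsquashed points would.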
This function will not need to be called directly if using
exploreHypers or autoHyper.
A scalar negative log-likelihood value
Make sure N_star matches the smallest actual count in ni before using this function. Filter ni, ei, and wi if needed.
Make sure the data were actually squashed (see squashData)
before using this function.
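The first check above can be sketched in base R as follows. The vectors are made-up toy values standing in for squashed counts, expected counts, and weights, and the names keep, ni_f, ei_f, and wi_f are hypothetical.

```r
# Toy stand-ins for squashed data (values invented for illustration)
ni <- c(1, 1, 2, 3, 5)
ei <- c(0.4, 1.2, 0.9, 2.5, 1.1)
wi <- c(40, 25, 10, 1, 1)
N_star <- 2

# Filter all three vectors with the same logical index so they stay aligned
keep <- ni >= N_star
ni_f <- ni[keep]
ei_f <- ei[keep]
wi_f <- wi[keep]

# Confirm the smallest remaining count equals N_star before optimizing
stopifnot(min(ni_f) == N_star)
```

Using a single logical index for all three vectors avoids pairing a count with the wrong expected value or weight.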
DuMouchel W, Pregibon D (2001). "Empirical Bayes Screening for Multi-item Associations." In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '01, pp. 67-76. ACM, New York, NY, USA. ISBN 1-58113-391-X.
nlm, nlminb, and optim for optimization and squashData for data squashing
Other negative log-likelihood functions: negLLzeroSquash(), negLLzero(), negLL()
theta_init <- c(0.2, 0.1, 2, 4, 1/3)  # initial guess
data(caers)
proc <- processRaw(caers)
squashed <- squashData(proc, count = 1, bin_size = 100, keep_pts = 100)
squashed <- squashData(squashed, count = 2, bin_size = 10, keep_pts = 20)
negLLsquash(theta = theta_init, ni = squashed$N, ei = squashed$E,
            wi = squashed$weight)

# For hyperparameter estimation...
stats::nlminb(start = theta_init, objective = negLLsquash, ni = squashed$N,
              ei = squashed$E, wi = squashed$weight)