Categorical distribution
Probability mass function, distribution function, quantile function and random generation for the categorical distribution.
dcat(x, prob, log = FALSE)
pcat(q, prob, lower.tail = TRUE, log.p = FALSE)
qcat(p, prob, lower.tail = TRUE, log.p = FALSE, labels)
rcat(n, prob, labels)
rcatlp(n, log_prob, labels)
x, q | vector of quantiles.
prob, log_prob | vector of length m, or m-column matrix of non-negative weights (or their logarithms in log_prob).
log, log.p | logical; if TRUE, probabilities p are given as log(p).
lower.tail | logical; if TRUE (default), probabilities are P[X ≤ x]; otherwise, P[X > x].
p | vector of probabilities.
labels | if provided, labeled …
n | number of observations. If …
Probability mass function
Pr(X = k) = w[k]/sum(w)
Cumulative distribution function
Pr(X <= k) = sum(w[1:k])/sum(w)
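As a rough illustration of the two formulas above, the following base R sketch computes both quantities from a vector of unnormalized weights. The helper names pmf_cat and cdf_cat are hypothetical and are not part of the documented interface; the treatment of non-integer and out-of-range values is an assumption made for the sketch. In practice dcat and pcat should be used.

```r
# Hypothetical helpers illustrating the formulas above;
# this is not the package's implementation.
pmf_cat <- function(k, w) {
  # Pr(X = k) = w[k] / sum(w) for integer k in 1..m, and 0 elsewhere (assumed)
  m <- length(w)
  ok <- k >= 1 & k <= m & k == floor(k)
  out <- numeric(length(k))
  out[ok] <- w[k[ok]] / sum(w)
  out
}

cdf_cat <- function(k, w) {
  # Pr(X <= k) = sum(w[1:k]) / sum(w); 0 below 1, and 1 at or above m (assumed)
  m <- length(w)
  cum <- cumsum(w) / sum(w)
  idx <- pmin(floor(k), m)
  out <- numeric(length(k))
  out[idx >= 1] <- cum[idx[idx >= 1]]
  out
}

w <- c(2, 4, 3, 1)   # unnormalized weights, sum(w) = 10
pmf_cat(1:4, w)      # 0.2 0.4 0.3 0.1
cdf_cat(1:4, w)      # 0.2 0.6 0.9 1.0
```

Because both helpers divide by sum(w), the weights do not need to sum to one.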
It is possible to sample from a categorical distribution parametrized
by a vector of unnormalized log-probabilities
α[1],...,α[m]
without leaving the log space by employing the Gumbel-max trick (Maddison, Tarlow and Minka, 2014).
If g[1],...,g[m] are independent samples from the standard Gumbel distribution with
cumulative distribution function F(g) = exp(-exp(-g)),
then k = argmax(g[i]+α[i])
is a draw from the categorical distribution parametrized by
the vector of probabilities p[1],...,p[m], such that
p[i] = exp(α[i])/sum(exp(α)).
This is implemented in the rcatlp function,
which is parametrized by a vector of
log-probabilities log_prob.
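The Gumbel-max trick described above might be sketched in base R roughly as follows. The function name rcat_gumbel is hypothetical; this is only an illustration of the technique for a single vector of log-probabilities, not the source of rcatlp. Gumbel variates are obtained here by inverting the stated distribution function, g = -log(-log(u)) for uniform u.

```r
# Hypothetical helper: a sketch of the Gumbel-max trick,
# not the actual rcatlp implementation.
rcat_gumbel <- function(n, log_prob) {
  m <- length(log_prob)
  # Standard Gumbel draws by inverse CDF: F(g) = exp(-exp(-g))  =>  g = -log(-log(u))
  g <- matrix(-log(-log(runif(n * m))), nrow = n, ncol = m)
  # Add the unnormalized log-probabilities to each row, then take the row-wise argmax
  scores <- sweep(g, 2, log_prob, "+")
  max.col(scores, ties.method = "first")
}

set.seed(1)
alpha <- log(c(2, 4, 3, 1))            # unnormalized log-weights
x <- rcat_gumbel(1e5, alpha)
round(tabulate(x, nbins = 4) / length(x), 2)   # should be close to 0.2, 0.4, 0.3, 0.1
```

Since only sums of log-weights and Gumbel noise are compared, the weights are never exponentiated or normalized, which is the point of staying in log space.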
Maddison, C. J., Tarlow, D., & Minka, T. (2014). A* sampling. [In:] Advances in Neural Information Processing Systems (pp. 3086-3094). https://arxiv.org/abs/1411.0030
# Generating 10 random draws from categorical distribution
# with k=3 categories occurring with equal probabilities
# parametrized using a vector

rcat(10, c(1/3, 1/3, 1/3))

# or with k=5 categories parametrized using a matrix of probabilities
# (generated from Dirichlet distribution)

p <- rdirichlet(10, c(1, 1, 1, 1, 1))
rcat(10, p)

x <- rcat(1e5, c(0.2, 0.4, 0.3, 0.1))
plot(prop.table(table(x)), type = "h")
lines(0:5, dcat(0:5, c(0.2, 0.4, 0.3, 0.1)), col = "red")

p <- rdirichlet(1, rep(1, 20))
x <- rcat(1e5, matrix(rep(p, 2), nrow = 2, byrow = TRUE))
xx <- 0:21
plot(prop.table(table(x)))
lines(xx, dcat(xx, p), col = "red")

xx <- seq(0, 21, by = 0.01)
plot(ecdf(x))
lines(xx, pcat(xx, p), col = "red", lwd = 2)

pp <- seq(0, 1, by = 0.001)
plot(ecdf(x))
lines(qcat(pp, p), pp, col = "red", lwd = 2)