PPC distributions
Compare the empirical distribution of the data y to the distributions
of simulated/replicated data yrep from the posterior predictive
distribution. See the Plot Descriptions section, below,
for details.
ppc_data(y, yrep, group = NULL)
ppc_hist(y, yrep, ..., binwidth = NULL, breaks = NULL, freq = TRUE)
ppc_boxplot(y, yrep, ..., notch = TRUE, size = 0.5, alpha = 1)
ppc_freqpoly(
y,
yrep,
...,
binwidth = NULL,
freq = TRUE,
size = 0.25,
alpha = 1
)
ppc_freqpoly_grouped(
y,
yrep,
group,
...,
binwidth = NULL,
freq = TRUE,
size = 0.25,
alpha = 1
)
ppc_dens(y, yrep, ..., trim = FALSE, size = 0.5, alpha = 1)
ppc_dens_overlay(
y,
yrep,
...,
size = 0.25,
alpha = 0.7,
trim = FALSE,
bw = "nrd0",
adjust = 1,
kernel = "gaussian",
n_dens = 1024
)
ppc_dens_overlay_grouped(
y,
yrep,
group,
...,
size = 0.25,
alpha = 0.7,
trim = FALSE,
bw = "nrd0",
adjust = 1,
kernel = "gaussian",
n_dens = 1024
)
ppc_ecdf_overlay(
y,
yrep,
...,
discrete = FALSE,
pad = TRUE,
size = 0.25,
alpha = 0.7
)
ppc_ecdf_overlay_grouped(
y,
yrep,
group,
...,
discrete = FALSE,
pad = TRUE,
size = 0.25,
alpha = 0.7
)
ppc_violin_grouped(
y,
yrep,
group,
...,
probs = c(0.1, 0.5, 0.9),
size = 1,
alpha = 1,
y_draw = c("violin", "points", "both"),
y_size = 1,
y_alpha = 1,
y_jitter = 0.1
)
y |
A vector of observations. See Details. |
yrep |
An S by N matrix of draws from the posterior
predictive distribution, where S is the size of the posterior sample
(or subset of the posterior sample used to generate |
group |
A grouping variable (a vector or factor) the same length as
|
... |
Currently unused. |
binwidth |
Passed to |
breaks |
Passed to |
freq |
For histograms, |
notch |
A logical scalar passed to |
size, alpha |
Passed to the appropriate geom to control the appearance of
the |
trim |
A logical scalar passed to |
bw, adjust, kernel, n_dens |
Optional arguments passed to
|
discrete |
For |
pad |
A logical scalar passed to |
probs |
A numeric vector passed to |
y_draw |
For |
y_jitter, y_size, y_alpha |
For |
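The shape requirements on y, yrep, and group can be checked before calling any of the plotting functions. The sketch below uses only base R and simulated data; the normal model, the seed, and the names N, S, and mu_draws are assumptions made for illustration and are not part of the package.

```r
set.seed(1)
N <- 50   # number of observations
S <- 200  # number of posterior draws (chosen arbitrarily here)

y <- rnorm(N, mean = 1, sd = 2)

# Stand-in for posterior draws of the mean; a real analysis would take
# these from a fitted model.
mu_draws <- rnorm(S, mean = mean(y), sd = 2 / sqrt(N))

# One replicated dataset per row, so yrep is an S by N matrix.
# sapply() returns N by S, hence the transpose.
yrep <- t(sapply(mu_draws, function(mu) rnorm(N, mean = mu, sd = 2)))

# A grouping variable with one entry per observation.
group <- factor(rep(c("a", "b"), length.out = N))

stopifnot(
  is.matrix(yrep),
  nrow(yrep) == S,
  ncol(yrep) == length(y),
  length(group) == length(y)
)
```

Objects built this way have the layout that the Usage signatures above expect: a vector y, a matrix yrep with one row per draw, and a group vector aligned with y.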
For Binomial data, the plots will typically be most useful if
y and yrep contain the "success" proportions (not discrete
"success" or "failure" counts).
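One way to obtain proportions from Binomial counts is to divide each count by its number of trials. The following base R sketch assumes a hypothetical vector trials with one entry per observation; the data are simulated and the names are illustrative only.

```r
set.seed(2)
N <- 30
S <- 100
trials <- sample(5:20, N, replace = TRUE)  # hypothetical trials per observation

# Observed and replicated success counts (simulated for illustration).
y_counts <- rbinom(N, size = trials, prob = 0.3)
yrep_counts <- t(replicate(S, rbinom(N, size = trials, prob = 0.3)))

# Convert counts to proportions. sweep() divides column j of the
# S by N matrix by trials[j].
y_prop <- y_counts / trials
yrep_prop <- sweep(yrep_counts, 2, trials, "/")

stopifnot(
  all(y_prop >= 0 & y_prop <= 1),
  all(yrep_prop >= 0 & yrep_prop <= 1)
)
```

y_prop and yrep_prop can then be passed as y and yrep in place of the raw counts.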
The plotting functions return a ggplot object that can be further
customized using the ggplot2 package. The functions with suffix
_data() return the data that would have been drawn by the plotting
function.
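As a rough illustration of the two kinds of return value, the sketch below adds a ggplot2 layer to a returned plot and then requests the underlying data with ppc_data(). It assumes the bayesplot and ggplot2 packages are installed and does nothing otherwise; the title text is arbitrary.

```r
if (requireNamespace("bayesplot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  y <- bayesplot::example_y_data()
  yrep <- bayesplot::example_yrep_draws()

  # The plotting functions return a ggplot object, so ggplot2
  # components can be added with `+`.
  p <- bayesplot::ppc_dens_overlay(y, yrep[1:25, ]) +
    ggplot2::ggtitle("Observed vs. replicated densities")

  # ppc_data() returns the data instead of drawing a plot.
  d <- bayesplot::ppc_data(y, yrep[1:25, ])
  str(d)
}
```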
ppc_hist(), ppc_freqpoly(), ppc_dens(), ppc_boxplot()
A separate histogram, shaded frequency polygon, smoothed kernel density
estimate, or box and whiskers plot is displayed for y and each
dataset (row) in yrep. For these plots yrep should therefore
contain only a small number of rows. See the Examples section.
ppc_freqpoly_grouped()
A separate frequency polygon is plotted for each level of a grouping
variable for y and each dataset (row) in yrep. For this plot
yrep should therefore contain only a small number of rows. See the
Examples section.
ppc_ecdf_overlay(), ppc_dens_overlay(), ppc_ecdf_overlay_grouped(), ppc_dens_overlay_grouped()
Kernel density or empirical CDF estimates of each dataset (row) in
yrep are overlaid, with the distribution of y itself on top
(and in a darker shade). When using ppc_ecdf_overlay() with discrete
data, set the discrete argument to TRUE for better results.
For an example of ppc_dens_overlay() also see Gabry et al. (2019).
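For count data, the discrete argument can be set as in the sketch below. The Poisson model, the rate, and the seed are assumptions made for illustration, and the code runs only if bayesplot is installed.

```r
if (requireNamespace("bayesplot", quietly = TRUE)) {
  set.seed(3)
  N <- 100
  S <- 25

  # Simulated counts standing in for observed and replicated data.
  y <- rpois(N, lambda = 4)
  yrep <- t(replicate(S, rpois(N, lambda = 4)))  # S by N matrix

  # Ask ppc_ecdf_overlay() to treat the data as discrete.
  p_ecdf <- bayesplot::ppc_ecdf_overlay(y, yrep, discrete = TRUE)
}
```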
ppc_violin_grouped()
The density estimate of yrep within each level of a grouping
variable is plotted as a violin with horizontal lines at notable
quantiles. y is overlaid on the plot either as a violin, points, or
both, depending on the y_draw argument.
Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., and Gelman, A. (2019). Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389-402. doi:10.1111/rssa.12378.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. (Ch. 6)
Other PPCs:
PPC-censoring,
PPC-discrete,
PPC-errors,
PPC-intervals,
PPC-loo,
PPC-overview,
PPC-scatterplots,
PPC-test-statistics
color_scheme_set("brightblue")
y <- example_y_data()
yrep <- example_yrep_draws()
dim(yrep)
ppc_dens_overlay(y, yrep[1:25, ])
# ppc_ecdf_overlay with continuous data (set discrete=TRUE if discrete data)
ppc_ecdf_overlay(y, yrep[sample(nrow(yrep), 25), ])
# for ppc_hist, ppc_dens, ppc_freqpoly, and ppc_boxplot use a subset of the
# yrep rows so that only a few (instead of nrow(yrep)) panels are plotted
ppc_hist(y, yrep[1:8, ])
color_scheme_set("red")
ppc_boxplot(y, yrep[1:8, ])
# wizard hat plot
color_scheme_set("blue")
ppc_dens(y, yrep[200:202, ])
ppc_freqpoly(y, yrep[1:3,], alpha = 0.1, size = 1, binwidth = 5)
# if groups are different sizes then the 'freq' argument can be useful
group <- example_group_data()
ppc_freqpoly_grouped(y, yrep[1:3,], group) + yaxis_text()
ppc_freqpoly_grouped(y, yrep[1:3,], group, freq = FALSE) + yaxis_text()
# density and distribution overlays by group
ppc_dens_overlay_grouped(y, yrep[1:25, ], group = group)
ppc_ecdf_overlay_grouped(y, yrep[1:25, ], group = group)
# ppc_violin_grouped does not need a small subset of rows
# (it pools the yrep draws within each group)
color_scheme_set("gray")
ppc_violin_grouped(y, yrep, group, size = 1.5)
ppc_violin_grouped(y, yrep, group, alpha = 0)
# change how y is drawn
ppc_violin_grouped(y, yrep, group, alpha = 0, y_draw = "points", y_size = 1.5)
ppc_violin_grouped(y, yrep, group, alpha = 0, y_draw = "both",
y_size = 1.5, y_alpha = 0.5, y_jitter = 0.33)