Preference Learning with the Mallows Rank Model
Compute the posterior distributions of the parameters of the Bayesian Mallows Rank Model, given rankings or preferences stated by a set of assessors.
The BayesMallows
package uses the following parametrization of the
Mallows rank model (Mallows 1957):
p(r | α, ρ) = (1 / Z_n(α)) exp{-(α / n) d(r, ρ)}
where
r is a ranking, α is a scale parameter, ρ is the
latent consensus ranking, Z_{n}(α) is the partition function
(normalizing constant), and d(r,ρ) is a distance function
measuring the distance between r and ρ. Note that some
authors use a Mallows model without division by n in the exponent;
this includes the PerMallows
package, whose scale parameter
θ corresponds to α/n in the BayesMallows
package. We refer to Vitelli et al. (2018) for further
details of the Bayesian Mallows model.
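To make the parametrization concrete, the following sketch evaluates the unnormalized density exp{-(α / n) d(r, ρ)} with the footrule distance. It uses plain R only; the footrule helper, the example rankings, and the value of α are illustrative and not part of the package API:

```r
# Footrule distance: sum of absolute differences between rank vectors
footrule <- function(r, rho) sum(abs(r - rho))

# Unnormalized Mallows density exp(-(alpha / n) * d(r, rho)).
# The partition function Z_n(alpha) is omitted; computing it is
# the hard part that the package handles internally.
mallows_kernel <- function(r, rho, alpha) {
  n <- length(rho)
  exp(-(alpha / n) * footrule(r, rho))
}

rho <- 1:5                # latent consensus ranking
r1  <- c(1, 2, 3, 5, 4)   # close to the consensus
r2  <- c(5, 4, 3, 2, 1)   # far from the consensus

mallows_kernel(r1, rho, alpha = 3)  # larger value: r1 is near rho
mallows_kernel(r2, rho, alpha = 3)  # smaller value: r2 is far from rho
```

Rankings closer to the consensus ρ receive higher probability mass, and larger α concentrates the mass more tightly around ρ.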
compute_mallows
always returns posterior distributions of the latent
consensus ranking ρ and the scale parameter α. Several
distance measures are supported, and the preferences can take the form of
complete or incomplete rankings, as well as pairwise preferences.
compute_mallows
can also compute mixtures of Mallows models, for
clustering of assessors with similar preferences.
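The partition function Z_n(α) mentioned above sums exp{-(α / n) d(r, ρ)} over all n! rankings r, which is why it can only be computed exactly for small n. A brute-force sketch in plain R (illustrative only, not package code):

```r
# Footrule distance between two rank vectors
footrule <- function(r, rho) sum(abs(r - rho))

# All permutations of a vector, by simple recursion (n! of them)
perms <- function(v) {
  if (length(v) == 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Brute-force Z_n(alpha) for the footrule metric; feasible only
# for tiny n since it enumerates all n! permutations.
partition_function <- function(n, alpha) {
  rho <- 1:n  # by right invariance, Z_n(alpha) does not depend on rho
  sum(vapply(perms(1:n),
             function(r) exp(-(alpha / n) * footrule(r, rho)),
             numeric(1)))
}

partition_function(3, alpha = 0)  # equals 3! = 6 when alpha = 0
partition_function(4, alpha = 2)  # smaller than 4! = 24 for alpha > 0
```

For realistic numbers of items this enumeration is infeasible, which is why the package works with precomputed or estimated values of the partition function (see the logz_estimate argument below).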
compute_mallows(
  rankings = NULL,
  preferences = NULL,
  obs_freq = NULL,
  metric = "footrule",
  error_model = NULL,
  n_clusters = 1L,
  save_clus = FALSE,
  clus_thin = 1L,
  nmc = 2000L,
  leap_size = max(1L, floor(n_items / 5)),
  swap_leap = 1L,
  rho_init = NULL,
  rho_thinning = 1L,
  alpha_prop_sd = 0.1,
  alpha_init = 1,
  alpha_jump = 1L,
  lambda = 0.001,
  alpha_max = 1e+06,
  psi = 10L,
  include_wcd = (n_clusters > 1),
  save_aug = FALSE,
  aug_thinning = 1L,
  logz_estimate = NULL,
  verbose = FALSE,
  validate_rankings = TRUE,
  na_action = "augment",
  constraints = NULL,
  save_ind_clus = FALSE,
  seed = NULL
)
rankings: A matrix of ranked items, of size n_assessors x n_items.
preferences: A dataframe with pairwise comparisons, with 3 columns,
named assessor, bottom_item, and top_item, and one row per stated
preference.
obs_freq: A vector of observation frequencies (weights) to apply to
each row in rankings. Defaults to NULL, in which case each row counts
once. See the examples for how this can speed up computation when many
assessors share the same ranking.
metric: A character string specifying the distance metric to use in
the Bayesian Mallows Model. Available options are "footrule",
"spearman", "cayley", "hamming", "kendall", and "ulam". Defaults to
"footrule".
error_model: Character string specifying which model to use for
inconsistent rankings. Defaults to NULL, which means that inconsistent
rankings are not allowed. Set to "bernoulli" for the Bernoulli error
model of Crispino et al. (2019).
n_clusters: Integer specifying the number of clusters, i.e., the
number of mixture components to use. Defaults to 1L, in which case no
clustering is performed.
save_clus: Logical specifying whether or not to save cluster
assignments. Defaults to FALSE.
clus_thin: Integer specifying the thinning to be applied to cluster
assignments and cluster probabilities. Defaults to 1L.
nmc: Integer specifying the number of iterations of the
Metropolis-Hastings algorithm to run. Defaults to 2000L.
leap_size: Integer specifying the step size of the leap-and-shift
proposal distribution. Defaults to max(1L, floor(n_items / 5)).
swap_leap: Integer specifying the step size of the Swap proposal.
Only used when error_model is not NULL. Defaults to 1L.
rho_init: Numeric vector specifying the initial value of the latent
consensus ranking ρ. Defaults to NULL, which means that the initial
value is set randomly. If provided, rho_init must be a proper
permutation of 1, ..., n_items.
rho_thinning: Integer specifying the thinning of ρ to be applied in
the Metropolis-Hastings algorithm. Defaults to 1L.
alpha_prop_sd: Numeric value specifying the standard deviation of the
lognormal proposal distribution used for α in the Metropolis-Hastings
algorithm. Defaults to 0.1.
alpha_init: Numeric value specifying the initial value of the scale
parameter α. Defaults to 1.
alpha_jump: Integer specifying how many times to sample ρ between
each sampling of α. In other words, how many times to jump over α
while sampling ρ, and possibly other parameters like augmented ranks
or cluster assignments z. Setting alpha_jump to a high number can
speed up computation, since the partition function then needs to be
evaluated less often. Defaults to 1L.
lambda: Strictly positive numeric value specifying the rate parameter
of the truncated exponential prior distribution of α. Defaults to
0.001.
alpha_max: Maximum value of α in the truncated exponential prior
distribution. Defaults to 1e+06.
psi: Integer specifying the concentration parameter ψ of the
Dirichlet prior distribution used for the cluster probabilities
τ_1, τ_2, ..., τ_C, where C is the value of n_clusters. Defaults to
10L.
include_wcd: Logical indicating whether to store the within-cluster
distances computed during the Metropolis-Hastings algorithm. Defaults
to n_clusters > 1. The within-cluster distances are required by
plot_elbow.
save_aug: Logical specifying whether or not to save the augmented
rankings every aug_thinning iterations. Defaults to FALSE.
aug_thinning: Integer specifying the thinning for saving augmented
data. Only used when save_aug = TRUE. Defaults to 1L.
logz_estimate: Estimate of the partition function, computed with
estimate_partition_function. Defaults to NULL.
verbose: Logical specifying whether to print out the progress of the
Metropolis-Hastings algorithm. If TRUE, progress messages are printed
during sampling. Defaults to FALSE.
validate_rankings: Logical specifying whether the rankings provided
(or generated from preferences) should be validated as proper
rankings. Defaults to TRUE.
na_action: Character specifying how to deal with NA values in the
rankings matrix. Defaults to "augment", which means that NA values
are estimated using data augmentation.
constraints: Optional constraint set returned from
generate_constraints. Defaults to NULL, in which case the constraints
are computed internally when preferences are provided.
save_ind_clus: Whether or not to save the individual cluster
probabilities in each step. This results in one csv file per iteration
being written to the working directory, so use with care. Defaults to
FALSE.
seed: Optional integer to be used as random number seed.
compute_mallows returns a list of class BayesMallows.
Crispino M, Arjas E, Vitelli V, Barrett N, Frigessi A (2019).
“A Bayesian Mallows approach to nontransitive pair comparison data: How human are sounds?”
The Annals of Applied Statistics, 13(1), 492–519.
doi:10.1214/18-aoas1203.
Mallows CL (1957).
“Non-Null Ranking Models. I.”
Biometrika, 44(1/2), 114–130.
Vitelli V, Sørensen Ø, Crispino M, Arjas E, Frigessi A (2018).
“Probabilistic Preference Learning with the Mallows Rank Model.”
Journal of Machine Learning Research, 18(1), 1–49.
https://jmlr.org/papers/v18/15-481.html.
See compute_mallows_mixtures for a function that computes separate
Mallows models for varying numbers of clusters.
# ANALYSIS OF COMPLETE RANKINGS
# The example datasets potato_visual and potato_weighing contain complete
# rankings of 20 items, by 12 assessors. We first analyse these using the
# Mallows model:
model_fit <- compute_mallows(potato_visual)

# We study the trace plot of the parameters
assess_convergence(model_fit, parameter = "alpha")
## Not run: assess_convergence(model_fit, parameter = "rho")

# Based on these plots, we set burnin = 1000.
model_fit$burnin <- 1000

# Next, we use the generic plot function to study the posterior distributions
# of alpha and rho
plot(model_fit, parameter = "alpha")
## Not run: plot(model_fit, parameter = "rho", items = 10:15)

# We can also compute the CP consensus posterior ranking
compute_consensus(model_fit, type = "CP")

# And we can compute the posterior intervals:
# First we compute the interval for alpha
compute_posterior_intervals(model_fit, parameter = "alpha")
# Then we compute the interval for all the items
## Not run: compute_posterior_intervals(model_fit, parameter = "rho")

# ANALYSIS OF PAIRWISE PREFERENCES
## Not run:
# The example dataset beach_preferences contains pairwise
# preferences between beaches stated by 60 assessors. There
# is a total of 15 beaches in the dataset.
# In order to use it, we first generate all the orderings
# implied by the pairwise preferences.
beach_tc <- generate_transitive_closure(beach_preferences)
# We also generate an initial ranking
beach_rankings <- generate_initial_ranking(beach_tc, n_items = 15)
# We then run the Bayesian Mallows rank model.
# We save the augmented data for diagnostics purposes.
model_fit <- compute_mallows(rankings = beach_rankings,
                             preferences = beach_tc,
                             save_aug = TRUE,
                             verbose = TRUE)
# We can assess the convergence of the scale parameter
assess_convergence(model_fit)
# We can assess the convergence of latent rankings. Here we
# show beaches 1-5.
assess_convergence(model_fit, parameter = "rho", items = 1:5)
# We can also look at the convergence of the augmented rankings for
# each assessor.
assess_convergence(model_fit, parameter = "Rtilde",
                   items = c(2, 4), assessors = c(1, 2))
# Notice how, for assessor 1, the lines cross each other, while
# beach 2 consistently has a higher rank value (lower preference) for
# assessor 2. We can see why by looking at the implied orderings in
# beach_tc
library(dplyr)
beach_tc %>%
  filter(assessor %in% c(1, 2),
         bottom_item %in% c(2, 4) & top_item %in% c(2, 4))
# Assessor 1 has no implied ordering between beach 2 and beach 4,
# while assessor 2 has the implied ordering that beach 4 is preferred
# to beach 2. This is reflected in the trace plots.
## End(Not run)

# CLUSTERING OF ASSESSORS WITH SIMILAR PREFERENCES
## Not run:
# The example dataset sushi_rankings contains 5000 complete
# rankings of 10 types of sushi.
# We start by computing a 3-cluster solution, and save
# cluster assignments by setting save_clus = TRUE
model_fit <- compute_mallows(sushi_rankings, n_clusters = 3,
                             nmc = 10000, save_clus = TRUE,
                             verbose = TRUE)
# We then assess convergence of the scale parameter alpha
assess_convergence(model_fit)
# Next, we assess convergence of the cluster probabilities
assess_convergence(model_fit, parameter = "cluster_probs")
# Based on this, we set burnin = 1000
# We now plot the posterior density of the scale parameters alpha in
# each mixture:
model_fit$burnin <- 1000
plot(model_fit, parameter = "alpha")
# We can also compute the posterior density of the cluster probabilities
plot(model_fit, parameter = "cluster_probs")
# We can also plot the posterior cluster assignment. In this case,
# the assessors are sorted according to their maximum a posteriori
# cluster estimate.
plot(model_fit, parameter = "cluster_assignment")
# We can also assign each assessor to a cluster
cluster_assignments <- assign_cluster(model_fit, soft = FALSE)
## End(Not run)

# DETERMINING THE NUMBER OF CLUSTERS
## Not run:
# Continuing with the sushi data, we can determine the number of clusters.
# Let us look at any number of clusters from 1 to 10.
# We use the convenience function compute_mallows_mixtures
n_clusters <- seq(from = 1, to = 10)
models <- compute_mallows_mixtures(n_clusters = n_clusters,
                                   rankings = sushi_rankings,
                                   nmc = 6000, alpha_jump = 10,
                                   include_wcd = TRUE)
# models is a list in which each element is an object of class BayesMallows,
# returned from compute_mallows
# We can create an elbow plot
plot_elbow(models, burnin = 1000)
# We then select the number of clusters at a point where this plot has
# an "elbow", e.g., at 6 clusters.
## End(Not run)

# SPEEDING UP COMPUTATION WITH OBSERVATION FREQUENCIES
# With a large number of assessors taking on a relatively low number of
# unique rankings, the obs_freq argument allows providing a rankings
# matrix with the unique set of rankings, and the obs_freq vector giving
# the number of assessors with each ranking.
# This is illustrated here for the potato_visual dataset.
#
# Assume each row of potato_visual corresponds to between 1 and 5
# assessors, as given by the obs_freq vector
set.seed(1234)
obs_freq <- sample.int(n = 5, size = nrow(potato_visual), replace = TRUE)
m <- compute_mallows(rankings = potato_visual, obs_freq = obs_freq)
# See the separate help page for more examples, with the following code
help("obs_freq")