BayesMallows: plot_elbow – R documentation

plot_elbow

Plot Within-Cluster Sum of Distances

Description

Plot the within-cluster sum of distances from the corresponding cluster consensus for different numbers of clusters. This function is useful for selecting the number of mixture components.
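
To make the plotted quantity concrete, here is a minimal base R sketch of a within-cluster sum of distances for one fixed set of cluster assignments and consensus rankings. It assumes the footrule distance and uses made-up toy rankings. The names `footrule`, `rankings`, `assignment` and `consensus` are illustrative and not part of BayesMallows, and the sketch is not the package's implementation: a fitted model yields a distribution of this quantity over MCMC iterations, which is why the result is shown as a boxplot.

```r
# Footrule distance between two rankings: sum of absolute rank differences.
footrule <- function(r, rho) sum(abs(r - rho))

# Toy data (made up): 4 assessors ranking 3 items, one row per assessor.
rankings <- rbind(
  c(1, 2, 3),
  c(1, 3, 2),
  c(3, 2, 1),
  c(3, 1, 2)
)

# Assumed cluster assignment per assessor and one consensus ranking per cluster.
assignment <- c(1, 1, 2, 2)
consensus <- rbind(
  c(1, 2, 3),  # cluster 1
  c(3, 2, 1)   # cluster 2
)

# Distance from each assessor to the consensus of that assessor's own cluster.
d <- vapply(
  seq_len(nrow(rankings)),
  function(j) footrule(rankings[j, ], consensus[assignment[j], ]),
  numeric(1)
)

# Within-cluster sum of distances, per cluster and in total.
wcd_per_cluster <- tapply(d, assignment, sum)
wcd_total <- sum(d)
```

The total typically decreases as clusters are added, so one looks for the point where the decrease levels off rather than for a minimum.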

Usage

plot_elbow(..., burnin = NULL)

Arguments

`...`	One or more objects returned from `compute_mallows`, separated by commas, or a list of such objects. Typically, each object has been fitted with a different number of mixture components, as specified in the `n_clusters` argument to `compute_mallows`.
`burnin`	The number of iterations to discard as burnin. Either a vector of numbers, one for each model, or a single number which is taken to be the burnin for all models. If every model provided has a `burnin` element, those values are used by default.
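
The rule for `burnin` described above can be sketched in base R as follows. `resolve_burnin` is a hypothetical helper written only for illustration; it is not a BayesMallows function and may differ from what `plot_elbow` does internally. The models are stand-in lists that carry nothing but a `burnin` element.

```r
# Hypothetical helper (not part of BayesMallows): return one burnin per model.
resolve_burnin <- function(models, burnin = NULL) {
  n <- length(models)
  if (is.null(burnin)) {
    # No argument given: fall back to the burnin element stored in each model.
    stored <- lapply(models, function(m) m$burnin)
    if (any(vapply(stored, is.null, logical(1)))) {
      stop("Provide 'burnin', or set a burnin element on every model.")
    }
    return(unlist(stored))
  }
  if (length(burnin) == 1) {
    # A single number is used for all models.
    return(rep(burnin, n))
  }
  if (length(burnin) != n) {
    stop("'burnin' must have length 1 or one value per model.")
  }
  burnin
}

# Stand-ins for fitted models: plain lists with only a burnin element.
models <- list(list(burnin = 500), list(burnin = 800), list(burnin = 1000))

from_models <- resolve_burnin(models)                           # stored values
recycled    <- resolve_burnin(models, burnin = 1000)            # one value for all
per_model   <- resolve_burnin(models, burnin = c(200, 400, 600))  # one per model
```

With real fits, the corresponding calls would be `plot_elbow(models)`, `plot_elbow(models, burnin = 1000)`, and `plot_elbow(models, burnin = c(...))` with one value per model.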

Value

A boxplot with the number of clusters on the horizontal axis and the within-cluster sum of distances on the vertical axis.

Examples

# DETERMINING THE NUMBER OF CLUSTERS IN THE SUSHI EXAMPLE DATA
## Not run: 
  # Fit one model for each number of clusters from 1 to 10
  # We use the convenience function compute_mallows_mixtures
  n_clusters <- seq(from = 1, to = 10)
  models <- compute_mallows_mixtures(n_clusters = n_clusters,
                                     rankings = sushi_rankings,
                                     include_wcd = TRUE)
  # models is a list in which each element is an object of class BayesMallows,
  # returned from compute_mallows
  # We can create an elbow plot
  plot_elbow(models, burnin = 1000)
  # We then select the number of clusters at a point where this plot has
  # an "elbow", e.g., n_clusters = 5.

  # Having chosen the number of clusters, we can now study the final model
  # Rerun with 5 clusters, now setting save_clus = TRUE to get cluster assignments
  mixture_model <- compute_mallows(rankings = sushi_rankings, n_clusters = 5,
                                   include_wcd = TRUE, save_clus = TRUE)
  # Delete the models object to free some memory
  rm(models)
  # Set the burnin
  mixture_model$burnin <- 1000
  # Plot the posterior distributions of alpha per cluster
  plot(mixture_model)
  # Compute the posterior interval of alpha per cluster
  compute_posterior_intervals(mixture_model,
                              parameter = "alpha")
  # Plot the posterior distributions of cluster probabilities
  plot(mixture_model, parameter = "cluster_probs")
  # Plot the posterior probability of cluster assignment
  plot(mixture_model, parameter = "cluster_assignment")
  # Plot the posterior distribution of "tuna roll" in each cluster
  plot(mixture_model, parameter = "rho", items = "tuna roll")
  # Compute the cluster-wise CP consensus, and show one column per cluster
  cp <- compute_consensus(mixture_model, type = "CP")
  library(dplyr)
  library(tidyr)
  cp %>%
    select(-cumprob) %>%
    spread(key = cluster, value = item)
  # Compute the MAP consensus, and show one column per cluster
  map <- compute_consensus(mixture_model, type = "MAP")
  map %>%
    select(-probability) %>%
    spread(key = cluster, value = item)

  # RUNNING IN PARALLEL
  # Computing Mallows models with different numbers of mixture components in
  # parallel leads to a considerable speedup
  library(parallel)
  cl <- makeCluster(detectCores() - 1)
  n_clusters <- seq(from = 1, to = 10)
  models <- compute_mallows_mixtures(n_clusters = n_clusters,
                                     rankings = sushi_rankings,
                                     include_wcd = TRUE, cl = cl)
  stopCluster(cl)

## End(Not run)

BayesMallows

Bayesian Preference Learning with the Mallows Rank Model

v1.0.1

GPL-3

Authors

Oystein Sorensen [aut, cre] (<https://orcid.org/0000-0003-0724-3542>), Valeria Vitelli [aut] (<https://orcid.org/0000-0002-6746-0453>), Marta Crispino [aut], Qinghua Liu [aut], Cristina Mollica [aut], Luca Tardella [aut]
