
rainette_stats

Generate cluster keyness statistics from a rainette result


Description

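Computes keyness statistics for each cluster of a rainette result: for every group, the features that are most over- or under-represented compared with the rest of the corpus.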

Usage

rainette_stats(
  groups,
  dtm,
  measure = c("chi2", "lr"),
  n_terms = 15,
  show_negative = TRUE,
  max_p = 0.05
)

Arguments

groups

group membership computed by cutree_rainette or cutree_rainette2

dtm

the dfm object used to compute the clustering

measure

keyness statistic to compute: "chi2" (chi-squared, the default) or "lr" (likelihood ratio)

n_terms

maximum number of terms to keep for each group (the number displayed in keyness plots)

show_negative

if TRUE, also include features with negative keyness, i.e. those under-represented in the group

max_p

maximum p-value of the keyness statistic; features with a higher p-value are dropped

Value

A list with, for each group, a data.frame of keyness statistics for the most specific n_terms features.
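
To make the idea of a keyness statistic concrete, here is a minimal base R sketch of a chi-squared keyness test for a single term in a single group. It is an illustration, not the package's implementation: the counts are invented, and the sign convention (positive when the term is over-represented in the group) is an assumption about how such statistics are usually reported.

```r
# Hypothetical counts for one term: inside one cluster vs. all other segments
n_target <- 30          # occurrences of the term in the cluster
n_reference <- 10       # occurrences of the term elsewhere
total_target <- 1000    # total tokens in the cluster
total_reference <- 2000 # total tokens elsewhere

# 2 x 2 contingency table: term vs. other terms, target vs. reference
tab <- matrix(
  c(n_target, total_target - n_target,
    n_reference, total_reference - n_reference),
  nrow = 2,
  dimnames = list(c("term", "other"), c("target", "reference"))
)

test <- chisq.test(tab)

# Sign the statistic: positive if the term is more frequent than expected
expected_target <- test$expected["term", "target"]
signed_chi2 <- unname(test$statistic) * sign(n_target - expected_target)

# Keep the term only if its p-value passes a max_p-style threshold
keep <- test$p.value <= 0.05
```

In this toy case the term is observed more often in the cluster than expected, so the signed statistic is positive and the term would pass a 0.05 threshold.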

See Also

Examples

library(quanteda)
library(rainette)

corpus <- data_corpus_inaugural
corpus <- head(corpus, n = 10)
corpus <- split_segments(corpus)
# Tokenize first: recent quanteda versions no longer build a dfm directly from a corpus
tok <- tokens(corpus, remove_punct = TRUE)
tok <- tokens_remove(tok, stopwords("en"))
dtm <- dfm(tok, tolower = TRUE)
dtm <- dfm_trim(dtm, min_termfreq = 3)
res <- rainette(dtm, k = 3)
groups <- cutree_rainette(res, k = 3)
rainette_stats(groups, dtm)
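
The non-default arguments can be combined in the same way. The following sketch, which assumes the rainette and quanteda packages are installed, rebuilds a small dtm and asks for the five most specific positive features per group using the likelihood ratio measure. The variable name `stats` is only illustrative.

```r
library(quanteda)
library(rainette)

# Build a small document-feature matrix from the first inaugural speeches
corpus <- split_segments(head(data_corpus_inaugural, n = 10))
tok <- tokens(corpus, remove_punct = TRUE)
tok <- tokens_remove(tok, stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_termfreq = 3)

# Cluster the segments and cut the result into three groups
res <- rainette(dtm, k = 3)
groups <- cutree_rainette(res, k = 3)

# Likelihood ratio keyness, positive features only, at most 5 terms per group
stats <- rainette_stats(
  groups, dtm,
  measure = "lr",
  n_terms = 5,
  show_negative = FALSE
)

# One data.frame per group; inspect the first one
length(stats)
head(stats[[1]])
```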

rainette

The Reinert Method for Textual Data Clustering

v0.1.3
GPL (>= 3)
Authors
Julien Barnier [aut, cre], Florian Privé [ctb]
Initial release
2021-05-10