Label topics
Generate a set of words describing each topic from a fitted STM object. Uses a variety of labeling algorithms (see details).
labelTopics(model, topics = NULL, n = 7, frexweight = 0.5)
model |
An STM model object. |
topics |
A vector of numbers indicating the topics to include. Default is all topics. |
n |
The desired number of words (per type) used to label each topic. Must be 1 or greater. |
frexweight |
A weight used in our approximate FREX scoring algorithm (see details). |
Four different types of word weightings are printed by labelTopics: Highest Prob, FREX, Lift, and Score.
Highest Prob: the words within each topic with the highest probability (inferred directly from the topic-word distribution parameter β).
FREX: the words that are both frequent and exclusive, identifying words
that distinguish topics. This is calculated by taking the harmonic mean of
rank by probability within the topic (frequency) and rank by distribution of
topic given word p(z|w=v) (exclusivity). In estimating exclusivity we
use a James-Stein type shrinkage estimator of the distribution
p(z|w=v). More information can be found in the documentation for the
internal functions calcfrex and js.estimate.
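The difference between Highest Prob and FREX can be illustrated with a small base R sketch. This is a simplified illustration, not the package's calcfrex: the toy matrix `beta`, the vocabulary, and the helper `frex_sketch` are hypothetical, the James-Stein shrinkage step is omitted, and treating `w` as the weight placed on the frequency rank is an assumption.

```r
# Toy topic-word matrix: rows are topics, columns are vocabulary words.
# All numbers are made up for illustration; each row sums to 1.
vocab <- c("tax", "budget", "border", "school", "the")
beta <- rbind(
  c(0.25, 0.10, 0.03, 0.02, 0.60),
  c(0.02, 0.03, 0.20, 0.15, 0.60)
)

# Hypothetical helper: a simplified FREX-style score without shrinkage.
frex_sketch <- function(beta, w = 0.5) {
  V <- ncol(beta)
  # Exclusivity p(z | w = v): normalise each word's column over topics.
  excl <- sweep(beta, 2, colSums(beta), "/")
  # Rank words within each topic (row) and scale the ranks to (0, 1].
  freq_rank <- t(apply(beta, 1, rank)) / V
  excl_rank <- t(apply(excl, 1, rank)) / V
  # Weighted harmonic mean of the two scaled ranks.
  1 / (w / freq_rank + (1 - w) / excl_rank)
}

scores <- frex_sketch(beta)

# Most probable word per topic versus best word by the sketch score.
top_prob <- apply(beta, 1, function(p) vocab[which.max(p)])
top_frex <- apply(scores, 1, function(s) vocab[which.max(s)])
```

In this toy example the shared word "the" is the most probable word in both topics, while the harmonic-mean score favours a word that is concentrated in a single topic.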
A labelTopics object (list)
prob |
matrix of highest probability words |
frex |
matrix of highest ranking frex words |
lift |
matrix of highest scoring words by lift |
score |
matrix of best words by score |
topicnums |
a vector of topic numbers which correspond to the rows |
labelTopics(gadarianFit)
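As a further sketch, the components of the returned list can be inspected directly. This assumes the stm package is installed and that the gadarianFit example model is available once the package is attached. The value `n = 5` and the name `labs` are arbitrary choices for illustration.

```r
# Runs only when stm is installed (an assumption about the environment).
if (requireNamespace("stm", quietly = TRUE)) {
  library(stm)

  # Request five words per label type instead of the default seven.
  labs <- labelTopics(gadarianFit, n = 5)

  print(labs$prob)       # matrix of highest probability words
  print(labs$frex)       # matrix of highest ranking FREX words
  print(labs$topicnums)  # topic numbers corresponding to the rows
}
```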