An R wrapper for the Mallet topic modeling package
This package provides an interface to the Java implementation of latent Dirichlet allocation in the Mallet machine learning package. Mallet has many functions; this wrapper focuses on the topic modeling sub-package written by David Mimno. The package uses the rJava package to connect to a JVM.
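Because the model runs inside a JVM started through rJava, memory settings for large corpora have to be in place before that JVM starts. A minimal sketch, assuming a 2 GB heap is appropriate for your machine (the value is an arbitrary illustration, not a recommendation from the package):

```r
# JVM options are read once, when rJava starts the JVM, so set them
# before loading mallet (which loads rJava).
# "-Xmx2g" is an assumed example value; adjust to your corpus and hardware.
options(java.parameters = "-Xmx2g")

library(mallet)
```

Setting the option after the package has been loaded has no effect on the running JVM; restart the R session to change it.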
Package: mallet
Type: Package
Version: 1.0
Date: 2013-08-08
License: MIT
Create a topic model trainer: MalletLDA
Load documents from disk and import them:
mallet.read.dir
mallet.import
Get info about word frequencies: mallet.word.freqs
Get trained model parameters:
mallet.doc.topics
mallet.topic.words
mallet.subset.topic.words
Reports on topic words:
mallet.top.words
mallet.topic.labels
Clustering of topics: mallet.topic.hclust
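The functions listed above are typically used in sequence. The following is a rough sketch of that workflow rather than a definitive recipe: the toy corpus, the stoplist contents, the number of topics, and the iteration count are all made up for illustration, and the argument names reflect version 1.0 as best recalled, so check each function's help page before relying on them.

```r
library(mallet)

# Hypothetical toy corpus: one small text file per document in a temp directory.
corpus.dir <- file.path(tempdir(), "toy-corpus")
dir.create(corpus.dir, showWarnings = FALSE)
texts <- c(
  "the cat chased the mouse and the cat slept",
  "a dog chased the cat and the dog barked",
  "stocks fell as the market reacted to interest rates",
  "the bank raised interest rates and the market fell",
  "the team won the match with a late goal",
  "a late goal gave the team the league title"
)
for (i in seq_along(texts)) {
  writeLines(texts[i], file.path(corpus.dir, sprintf("doc%d.txt", i)))
}

# Hypothetical stoplist file: one word per line.
stoplist.file <- file.path(tempdir(), "stoplist.txt")
writeLines(c("the", "a", "and", "as", "to", "with"), stoplist.file)

# Load documents from disk and import them as Mallet instances.
documents <- mallet.read.dir(corpus.dir)
instances <- mallet.import(documents$id, documents$text, stoplist.file)

# Create a topic model trainer and attach the documents.
topic.model <- MalletLDA(num.topics = 3)
topic.model$loadDocuments(instances)

# Word frequencies are available as soon as documents are loaded.
word.freqs <- mallet.word.freqs(topic.model)
head(word.freqs)

# Run Gibbs sampling; 100 iterations is an arbitrary choice for a toy corpus.
topic.model$train(100)

# Trained parameters: documents-by-topics and topics-by-words matrices.
doc.topics <- mallet.doc.topics(topic.model, smoothed = TRUE, normalized = TRUE)
topic.words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)

# Reports on topic words: top words for the first topic, short labels for all.
mallet.top.words(topic.model, topic.words[1, ], num.top.words = 5)
topic.labels <- mallet.topic.labels(topic.model, topic.words, num.top.words = 3)

# Cluster topics and plot the dendrogram with the generated labels.
plot(mallet.topic.hclust(doc.topics, topic.words, balance = 0.3),
     labels = topic.labels)
```

Sampling is stochastic, so the topics and labels will differ from run to run, and a corpus this small will not produce meaningful topics; it only exercises the calls.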
Maintainer: David Mimno
The model, Latent Dirichlet allocation (LDA): David M Blei, Andrew Ng, Michael Jordan. Latent Dirichlet Allocation. J. of Machine Learning Research, 2003.
The Java toolkit: Andrew Kachites McCallum. The Mallet Toolkit. 2002.
Details of the fast sparse Gibbs sampling algorithm: Limin Yao, David Mimno, Andrew McCallum. Efficient Methods for Topic Model Inference on Streaming Document Collections. KDD, 2009.
Hyperparameter optimization: Hanna Wallach, David Mimno, Andrew McCallum. Rethinking LDA: Why Priors Matter. NIPS, 2009.