
ldaformat2dtm

Transform data from and for use with the lda package


Description

ldaformat2dtm() transforms data in the format used by package lda into a document-term matrix. This format can be used to fit topic models with package topicmodels.

dtm2ldaformat() performs the reverse conversion: a document-term matrix is transformed into the LDA format used by package lda.

Usage

ldaformat2dtm(documents, vocab, omit_empty = TRUE)
dtm2ldaformat(x, omit_empty = TRUE)

Arguments

documents

A list in which each entry corresponds to one document. Each document is stored as a matrix with two rows: in each column, the first entry is the vocabulary id of a term and the second entry is the number of times that term occurs in the document.
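As an illustration of this layout, the following base R sketch builds such a list by hand from a small made-up count matrix. The helper to_lda_document() and the toy data are hypothetical and not part of either package; the sketch also assumes that vocabulary ids are 0-based, which is the convention of package lda.

```r
# Toy count matrix: rows are documents, columns are terms (made-up data).
vocab <- c("topic", "model", "corpus")
counts <- matrix(c(2L, 0L, 1L,
                   0L, 3L, 0L),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"), vocab))

# Hypothetical helper: encode one row of counts as a two-row integer matrix.
# Row 1 holds the vocabulary ids, row 2 holds the term counts.
# Assumption: ids are 0-based, following the convention of package lda.
to_lda_document <- function(row_counts) {
  present <- which(row_counts > 0L)
  rbind(as.integer(present - 1L), as.integer(row_counts[present]))
}

documents <- lapply(seq_len(nrow(counts)),
                    function(i) to_lda_document(counts[i, ]))
names(documents) <- rownames(counts)

# Each entry has one column per distinct term that occurs in the document.
documents$doc1
```

Only terms that actually occur in a document get a column, so the matrices differ in width from one document to the next.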

vocab

A "character" vector of the terms in the vocabulary.

x

An object of class "DocumentTermMatrix" as defined in package tm.

omit_empty

A logical indicating if empty documents should be removed when converting the objects. By default empty documents are removed.
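A minimal sketch of the effect of omit_empty, assuming package topicmodels is installed (the code is skipped otherwise). The toy documents and the object names kept and all_docs are made up for illustration; the second document is empty, encoded here as a two-row matrix with no columns.

```r
# Assumption: package topicmodels is installed; the sketch is skipped otherwise.
if (requireNamespace("topicmodels", quietly = TRUE)) {
  vocab <- c("topic", "model", "corpus")
  documents <- list(
    rbind(c(0L, 2L), c(2L, 1L)),    # ids 0 and 2 with counts 2 and 1
    matrix(integer(0), nrow = 2),   # empty document: no columns
    rbind(1L, 3L)                   # id 1 with count 3
  )

  # Default behaviour (omit_empty = TRUE): empty documents are removed.
  kept <- topicmodels::ldaformat2dtm(documents, vocab)

  # Keep empty documents as all-zero rows of the document-term matrix.
  all_docs <- topicmodels::ldaformat2dtm(documents, vocab,
                                         omit_empty = FALSE)

  # Compare the number of rows (documents) in the two results.
  print(dim(kept))
  print(dim(all_docs))
}
```

Keeping empty documents can be useful when row positions must stay aligned with external metadata, at the cost of carrying rows that contain no terms.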

Value

An object of class "DocumentTermMatrix" is returned by ldaformat2dtm() and a list with components "documents" and "vocab" by dtm2ldaformat().

Author(s)

Bettina Gruen

Examples

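if (require("lda")) {
  # Load the Cora corpus shipped with package lda (already in LDA format).
  data("cora.documents", package = "lda")
  data("cora.vocab", package = "lda")
  # Convert to a document-term matrix and back again.
  dtm <- ldaformat2dtm(cora.documents, cora.vocab)
  cora <- dtm2ldaformat(dtm)
  # Check whether the round trip reproduces the original data.
  all.equal(cora, list(documents = cora.documents,
                       vocab = cora.vocab))
}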

topicmodels

Topic Models

v0.2-12
GPL-2
Authors
Bettina Grün [aut, cre] (<https://orcid.org/0000-0001-7265-4773>), Kurt Hornik [aut] (<https://orcid.org/0000-0003-4198-9911>), David M Blei [ctb, cph] (VEM estimation of LDA and CTM), John D Lafferty [ctb, cph] (VEM estimation of CTM), Xuan-Hieu Phan [ctb, cph] (MCMC estimation of LDA), Makoto Matsumoto [ctb, cph] (Mersenne Twister RNG), Takuji Nishimura [ctb, cph] (Mersenne Twister RNG), Shawn Cokus [ctb] (Mersenne Twister RNG)
Initial release
