Convert quanteda objects to non-quanteda formats
Convert a quanteda dfm or corpus object to a format useable by other
packages. The general function convert provides easy conversion from a dfm
to the document-term representations used in all other text analysis packages
for which conversions are defined. For corpus objects, convert provides
an easy way to make a corpus and its document variables into a data.frame.
convert(x, to, ...)

## S3 method for class 'dfm'
convert(
  x,
  to = c("lda", "tm", "stm", "austin", "topicmodels", "lsa", "matrix",
         "data.frame", "tripletlist"),
  docvars = NULL,
  omit_empty = TRUE,
  docid_field = "doc_id",
  ...
)

## S3 method for class 'corpus'
convert(x, to = c("data.frame", "json"), pretty = FALSE, ...)
x | a dfm or corpus object to be converted
to | target conversion format, one of:
... | unused directly
docvars | optional data.frame of document variables used as the
omit_empty | logical; if
docid_field | character; the name of the column containing document names used when
pretty | adds indentation whitespace to JSON output. Can be TRUE/FALSE or a number specifying the number of spaces to indent. See
A converted object determined by the value of to (see above).
See conversion target package documentation for more detailed descriptions
of the return formats.
## convert a dfm
toks <- corpus_subset(data_corpus_inaugural, Year > 1970) %>%
    tokens()
dfmat1 <- dfm(toks)

# austin's wfm format
identical(dim(dfmat1), dim(convert(dfmat1, to = "austin")))

# stm package format
stmmat <- convert(dfmat1, to = "stm")
str(stmmat)

# triplet
tripletmat <- convert(dfmat1, to = "tripletlist")
str(tripletmat)

## Not run:
# tm's DocumentTermMatrix format
tmdfm <- convert(dfmat1, to = "tm")
str(tmdfm)

# topicmodels package format
str(convert(dfmat1, to = "topicmodels"))

# lda package format
str(convert(dfmat1, to = "lda"))
## End(Not run)

## convert a corpus into a data.frame
corp <- corpus(c(d1 = "Text one.", d2 = "Text two."),
               docvars = data.frame(dvar1 = 1:2, dvar2 = c("one", "two"),
                                    stringsAsFactors = FALSE))
convert(corp, to = "data.frame")
convert(corp, to = "json")
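
The examples above do not exercise the "matrix" and "data.frame" targets of the dfm method, or the docid_field argument. The following is a minimal sketch of how they might be used, assuming the quanteda package is installed. The two toy documents and the column name "document" are illustrative choices, not part of the package.

```r
library(quanteda)

## a small toy dfm (documents and contents are made up for illustration)
toks_toy <- tokens(c(d1 = "a b b c", d2 = "c c d"))
dfmat_toy <- dfm(toks_toy)

# dense base-R matrix: one row per document, one column per feature
mat_toy <- convert(dfmat_toy, to = "matrix")
dim(mat_toy)

# data.frame whose first column holds the document names;
# docid_field sets the name of that column
df_toy <- convert(dfmat_toy, to = "data.frame", docid_field = "document")
names(df_toy)[1]
```

A dense matrix or data.frame can become very large for a dfm with many features, so these two targets are best suited to small objects such as the one sketched here.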