
document_term_casters

Casting a data frame to a DocumentTermMatrix, TermDocumentMatrix, or dfm


Description

These functions turn a "tidy" data frame with one row per term per document into a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a dfm from the quanteda package. They support non-standard evaluation through the tidyeval framework. Any grouping on the data frame is ignored.

Usage

cast_tdm(data, term, document, value, weighting = tm::weightTf, ...)

cast_dtm(data, document, term, value, weighting = tm::weightTf, ...)

cast_dfm(data, document, term, value, ...)
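The signatures above differ in argument order: cast_tdm() takes term before document, while cast_dtm() and cast_dfm() take document before term. The following is a minimal sketch of that difference, assuming the tm package is installed; the data frame word_counts and its columns doc, word, and n are hypothetical toy data, not part of tidytext.

```r
library(tidytext)

# Hypothetical toy data: one row per document-term pair, with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "fig"),
  n    = c(2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# cast_dtm(): document column first, then term column.
# Documents become rows and terms become columns.
dtm <- cast_dtm(word_counts, doc, word, n)

# cast_tdm(): term column first, then document column.
# Terms become rows and documents become columns.
tdm <- cast_tdm(word_counts, word, doc, n)

dim(dtm)
dim(tdm)
```

Naming the arguments explicitly (for example, document = doc, term = word) is one way to avoid mixing up the two orders.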

Arguments

data

Table with one-term-per-document-per-row

term

Column containing terms as string or symbol

document

Column containing document IDs as string or symbol

value

Column containing the values to fill the matrix with (for example, term counts), as string or symbol

weighting

The weighting function for the DTM/TDM (default is term-frequency, effectively unweighted)

...

Extra arguments passed on to Matrix::sparseMatrix
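As a rough illustration of the weighting argument, the sketch below swaps the default tm::weightTf for tm::weightTfIdf, another weighting function provided by tm. It assumes the tm package is installed; word_counts and its columns are hypothetical toy data.

```r
library(tidytext)

# Hypothetical toy data: one row per document-term pair, with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "fig"),
  n    = c(2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Default weighting: raw term frequencies.
dtm_tf <- cast_dtm(word_counts, doc, word, n)

# Alternative weighting: tf-idf, as implemented by tm::weightTfIdf.
dtm_tfidf <- cast_dtm(word_counts, doc, word, n,
                      weighting = tm::weightTfIdf)
```

Both results are DocumentTermMatrix objects with the same dimensions; only the stored cell values differ.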

Details

The arguments term, document, and value are passed by expression and support quasiquotation; you can unquote strings and symbols.
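The three ways of naming a column described above (bare symbol, string, and unquoted variable) might look like the following sketch. It assumes the tm package is installed; word_counts, its columns, and the variable term_col are hypothetical names chosen for illustration.

```r
library(tidytext)

# Hypothetical toy data: one row per document-term pair, with a count.
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "fig"),
  n    = c(2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Columns given as bare symbols.
dtm_sym <- cast_dtm(word_counts, doc, word, n)

# Columns given as strings.
dtm_str <- cast_dtm(word_counts, "doc", "word", "n")

# Column name held in a variable and unquoted with !!.
term_col <- "word"
dtm_unq <- cast_dtm(word_counts, doc, !!term_col, n)
```

The unquoting form is useful when the column name is only known at run time, for instance inside a wrapper function that receives it as a string.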


tidytext

Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools

v0.3.1
MIT + file LICENSE
Authors
Gabriela De Queiroz [ctb], Colin Fay [ctb] (<https://orcid.org/0000-0001-7343-1846>), Emil Hvitfeldt [ctb], Os Keyes [ctb] (<https://orcid.org/0000-0001-5196-609X>), Kanishka Misra [ctb], Tim Mastny [ctb], Jeff Erickson [ctb], David Robinson [aut], Julia Silge [aut, cre] (<https://orcid.org/0000-0002-3671-836X>)
