Create a document-feature matrix
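dfm(
  x,
  tolower = TRUE,
  remove_padding = FALSE,
  verbose = quanteda_options("verbose"),
  ...
)
x | a tokens or dfm object
tolower | convert all features to lowercase
remove_padding | logical; if TRUE, remove the "pads" left as empty tokens after calling tokens() or tokens_remove() with padding = TRUE
verbose | display messages if TRUE
... | not used directly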
a dfm object
In quanteda v3, many convenience functions formerly available in
dfm() were deprecated. Formerly, dfm() could be called directly on a
character or corpus object, but we now steer users to tokenise their
inputs first using tokens(). Other convenience arguments to dfm() were
also removed, such as select, dictionary, thesaurus, and groups. The
same functionality is available elsewhere, e.g. through dfm_group().
See news(Version >= "2.9", package = "quanteda") for details.
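The tokenise-first workflow described above can be sketched as follows. This is a minimal illustration, not the package's official migration guide: the three texts and the docvar name party are made up, and the pairing of the removed select and dictionary arguments with dfm_select() and dfm_lookup() is an assumption based on those functions' documented purpose (only dfm_group() is named in the text above).

```r
library("quanteda")

# Hypothetical mini-corpus; "party" is an invented document variable
txt <- c(d1 = "Taxes rose and spending rose.",
         d2 = "Spending on schools fell.",
         d3 = "Taxes fell.")
corp <- corpus(txt, docvars = data.frame(party = c("A", "A", "B")))

# Step 1: tokenise first, rather than passing the corpus to dfm()
toks <- tokens(corp, remove_punct = TRUE)

# Step 2: build the document-feature matrix from the tokens
dfmat <- dfm(toks)

# In place of the removed groups argument: group documents afterwards
dfmat_grouped <- dfm_group(dfmat, groups = docvars(dfmat, "party"))

# In place of the removed select argument (assumed replacement):
# keep only the features of interest
dfmat_sel <- dfm_select(dfmat, pattern = c("taxes", "spending"))

# In place of the removed dictionary argument (assumed replacement):
# map features onto dictionary keys
dict <- dictionary(list(fiscal = c("taxes", "spending")))
dfmat_dict <- dfm_lookup(dfmat, dictionary = dict)
```

Keeping each step as its own call makes the order of operations explicit, which the single dfm() call with many convenience arguments did not.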
## for a corpus
library("quanteda")
toks <- data_corpus_inaugural %>%
    corpus_subset(Year > 1980) %>%
    tokens()
dfm(toks)
# removal options
toks <- tokens(c("a b c", "A B C D")) %>%
    tokens_remove("b", padding = TRUE)
toks
dfm(toks)
dfm(toks, remove_padding = TRUE) # remove "pads"
# preserving case
dfm(toks, tolower = FALSE)