Create a document-feature matrix
dfm( x, tolower = TRUE, remove_padding = FALSE, verbose = quanteda_options("verbose"), ... )
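x | a tokens or dfm object
tolower | convert all features to lowercase
remove_padding | logical; if TRUE, remove the "pads" left as empty tokens after calling tokens() or tokens_remove() with padding = TRUE
verbose | display messages if TRUE
... | not used directly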
Returns a dfm object.
In quanteda v3, many convenience functions formerly available in dfm() were deprecated. Formerly, dfm() could be called directly on a character or corpus object, but we now steer users to tokenise their inputs first using tokens(). Other convenience arguments to dfm() were also removed, such as select, dictionary, thesaurus, and groups. All of these functions are available elsewhere, e.g. through dfm_group(). See news(Version >= "2.9", package = "quanteda") for details.
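The tokenise-first workflow described above can be sketched as follows. This is a minimal illustration, not an official migration recipe: the three texts, the party document variable, and the dictionary are made up for the example. The text above names only dfm_group() as a replacement; using dfm_select() and dfm_lookup() for the former select and dictionary arguments is our assumption about where that functionality now lives.

```r
library("quanteda")

# Illustrative corpus with a made-up grouping variable (hypothetical data)
corp <- corpus(
    c(d1 = "Taxes fund schools.",
      d2 = "Schools need teachers.",
      d3 = "Taxes are high."),
    docvars = data.frame(party = c("A", "B", "A"))
)

# Step 1: tokenise explicitly instead of passing the corpus to dfm()
toks <- tokens(corp, remove_punct = TRUE)

# Step 2: build the document-feature matrix from the tokens
dfmat <- dfm(toks)

# In place of the former groups argument: combine documents by a docvar
dfmat_grouped <- dfm_group(dfmat, groups = party)

# In place of the former select argument: keep only matching features
dfmat_selected <- dfm_select(dfmat, pattern = c("taxes", "schools"))

# In place of the former dictionary argument: map features to dictionary keys
dict <- dictionary(list(education = c("school*", "teacher*"),
                        fiscal = "tax*"))
dfmat_dict <- dfm_lookup(dfmat, dictionary = dict)
```

Keeping each step as a separate call makes it explicit which transformation happens at the token level and which at the matrix level.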
## for a corpus
toks <- data_corpus_inaugural %>%
    corpus_subset(Year > 1980) %>%
    tokens()
dfm(toks)

# removal options
toks <- tokens(c("a b c", "A B C D")) %>%
    tokens_remove("b", padding = TRUE)
toks
dfm(toks)
dfm(toks, remove_padding = TRUE) # remove "pads"

# preserving case
dfm(toks, tolower = FALSE)