Term-Document Matrix
Constructs or coerces to a term-document matrix or a document-term matrix.
TermDocumentMatrix(x, control = list())
DocumentTermMatrix(x, control = list())
as.TermDocumentMatrix(x, ...)
as.DocumentTermMatrix(x, ...)
x
    a corpus for the constructors; for the coercing functions, either a term-document matrix, a document-term matrix, a simple triplet matrix (package slam), or a term frequency vector.
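control
    a named list of control options. There are local options, which are evaluated for each document, and global options, which are evaluated once for the constructed matrix. Available local options are documented in termFreq. This is different for a SimpleCorpus, where all options are processed in a fixed order in one pass to improve performance. Available global options are:
    bounds: a list with a tag global whose value must be an integer vector of length 2. Terms that appear in fewer documents than the lower bound bounds$global[1] or in more documents than the upper bound bounds$global[2] are discarded. Defaults to list(global = c(1, Inf)), i.e., every term is used.
    weighting: a weighting function capable of handling a TermDocumentMatrix. Defaults to weightTf for term frequency weighting.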
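...
    the additional argument weighting (typically a WeightFunction) is allowed when coercing a simple triplet matrix to a term-document or document-term matrix.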
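The distinction between local and global control options can be illustrated with a minimal sketch. The toy documents below are hypothetical, and the sketch assumes the global options bounds and weighting behave as described in the tm documentation; removePunctuation and stopwords are local options of the kind handled by termFreq.

```r
library(tm)

# Hypothetical toy documents, used only for illustration.
docs <- c("Crude oil prices rise.",
          "Oil output falls in Texas.",
          "Texas crude prices rise again.")
corpus <- VCorpus(VectorSource(docs))

ctrl <- list(
  removePunctuation = TRUE,           # local: applied to each document
  stopwords = TRUE,                   # local: applied to each document
  bounds = list(global = c(2, Inf)),  # global (assumed): keep terms found in >= 2 documents
  weighting = weightBin               # global: binary presence/absence weights
)

tdm <- TermDocumentMatrix(corpus, control = ctrl)
inspect(tdm)
```

With these settings every remaining row should describe a term that occurs in at least two documents, and every entry should be 0 or 1.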
An object of class TermDocumentMatrix or class DocumentTermMatrix (both inheriting from a simple triplet matrix in package slam) containing a sparse term-document matrix or document-term matrix. The attribute weighting contains the weighting applied to the matrix.
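The returned object and the coercing functions can be explored with a short sketch. The count matrix below is hypothetical, and passing weighting = weightTf when coercing a simple triplet matrix is an assumption based on the tm documentation rather than something shown on this page.

```r
library(tm)
library(slam)

# Hypothetical toy counts: 3 terms (rows) by 2 documents (columns).
counts <- matrix(c(2, 0, 1,
                   0, 3, 1),
                 nrow = 3,
                 dimnames = list(Terms = c("oil", "price", "texas"),
                                 Docs = c("d1", "d2")))

# Coerce a slam simple triplet matrix to a term-document matrix.
stm <- as.simple_triplet_matrix(counts)
tdm <- as.TermDocumentMatrix(stm, weighting = weightTf)

# Coercing a term-document matrix yields its transpose.
dtm <- as.DocumentTermMatrix(tdm)

class(tdm)              # includes "simple_triplet_matrix"
attr(tdm, "weighting")  # the weighting recorded on the matrix
dim(tdm)                # terms by documents
dim(dtm)                # documents by terms
```

Because both classes inherit from a simple triplet matrix, only the non-zero entries are stored; as.matrix(tdm) gives a dense view when the matrix is small enough.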
See termFreq for available local control options.
data("crude")
tdm <- TermDocumentMatrix(crude,
                          control = list(removePunctuation = TRUE,
                                         stopwords = TRUE))
dtm <- DocumentTermMatrix(crude,
                          control = list(weighting =
                                           function(x)
                                             weightTfIdf(x, normalize = FALSE),
                                         stopwords = TRUE))
inspect(tdm[202:205, 1:5])
inspect(tdm[c("price", "prices", "texas"), c("127", "144", "191", "194")])
inspect(dtm[1:5, 273:276])

s <- SimpleCorpus(VectorSource(unlist(lapply(crude, as.character))))
m <- TermDocumentMatrix(s,
                        control = list(removeNumbers = TRUE,
                                       stopwords = TRUE,
                                       stemming = TRUE))
inspect(m[c("price", "texa"), c("127", "144", "191", "194")])