
dtm_sample

Random samples and permutations from a Document-Term-Matrix


Description

Sample the specified number of rows from the Document-Term-Matrix, either with or without replacement.

Usage

dtm_sample(dtm, size = nrow(dtm), replace = FALSE, prob = NULL)

Arguments

dtm

a document term matrix of class dgCMatrix (which can be an object returned by document_term_matrix)

size

a positive number, the number of rows to sample

replace

logical; should sampling be with replacement?

prob

a vector of probability weights, one for each row of dtm

Value

dtm with as many rows as specified in size

Examples

library(udpipe)
x <- list(doc1 = c("aa", "bb", "cc", "aa", "b"), 
          doc2 = c("bb", "bb", "dd", ""), 
          doc3 = character(),
          doc4 = c("cc", NA), 
          doc5 = character())
dtm <- document_term_matrix(x)
dtm_sample(dtm, size = 2)
dtm_sample(dtm, size = 3)
dtm_sample(dtm, size = 2)
dtm_sample(dtm, size = 8, replace = TRUE)
dtm_sample(dtm, size = 8, replace = TRUE, prob = c(1, 1, 0.01, 0.5, 0.01))
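
The row sampling described above can be illustrated without udpipe. The following is a minimal sketch, not necessarily how dtm_sample is implemented: it assumes that sampling amounts to drawing row indices with base R's sample.int and subsetting the sparse matrix. The helper name sample_rows and the toy matrix m are hypothetical and used for illustration only.

```r
library(Matrix)

# Hypothetical helper (an assumption, not udpipe's code):
# draw row indices, then subset the sparse matrix by those indices.
sample_rows <- function(m, size = nrow(m), replace = FALSE, prob = NULL) {
  idx <- sample.int(nrow(m), size = size, replace = replace, prob = prob)
  m[idx, , drop = FALSE]
}

# Toy 4 x 3 sparse matrix standing in for a document-term matrix.
m <- sparseMatrix(i = c(1, 1, 2, 3), j = c(1, 2, 2, 3), x = c(2, 1, 3, 1),
                  dims = c(4, 3),
                  dimnames = list(paste0("doc", 1:4), c("aa", "bb", "cc")))

set.seed(42)
# With replace = TRUE, size may exceed nrow(m) and rows can repeat.
s <- sample_rows(m, size = 6, replace = TRUE)
dim(s)
```

Under this assumption, requesting more rows than the matrix has without replacement raises an error from sample.int, which is why the size = 8 examples above set replace = TRUE.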

udpipe

Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

v0.8.5
MPL-2.0
Authors
Jan Wijffels [aut, cre, cph], BNOSAC [cph], Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic [cph], Milan Straka [ctb, cph], Jana Straková [ctb, cph]
Initial release
