Remove word classes
This method removes the specified word classes from tagged text objects.
filterByClass(txt, ...)

## S4 method for signature 'kRp.text'
filterByClass(
  txt,
  corp.rm.class = "nonpunct",
  corp.rm.tag = c(),
  as.vector = FALSE,
  update.desc = TRUE
)
txt | An object of class
... | Additional options, currently unused.
corp.rm.class | A character vector with word classes which should be removed. The default value
corp.rm.tag | A character vector with valid POS tags which should be removed.
as.vector | Logical. If
update.desc | Logical. If
An object of the input class. If as.vector=TRUE, returns only a character vector.
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  filterByClass(tokenized.obj)
} else {}
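The as.vector argument from the usage section can be tried in the same way. The following is a minimal sketch, not part of the original examples: it reuses the sample file and tokenize() call from the example and assumes that koRpus.lang.en is installed. The object name tokens.vec is hypothetical and only used for illustration.

```r
# illustrative sketch; 'tokens.vec' is a hypothetical name
tokens.vec <- NULL
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # keep the default filter, but ask for a plain character vector
  # instead of an object of the input class
  tokens.vec <- filterByClass(tokenized.obj, as.vector=TRUE)
  # show the first few remaining tokens
  print(head(tokens.vec))
}
```

If the language package cannot be loaded, tokens.vec stays NULL, so later code can check for that before using the result.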