detect noise
noise(.Object, ...)

## S4 method for signature 'DocumentTermMatrix'
noise(
  .Object,
  minTotal = 2,
  minTfIdfMean = 0.005,
  sparse = 0.995,
  stopwordsLanguage = "german",
  minNchar = 2,
  specialChars = getOption("polmineR.specialChars"),
  numbers = "^[0-9\\.,]+$",
  verbose = TRUE
)

## S4 method for signature 'TermDocumentMatrix'
noise(.Object, ...)

## S4 method for signature 'character'
noise(
  .Object,
  stopwordsLanguage = "german",
  minNchar = 2,
  specialChars = getOption("polmineR.specialChars"),
  numbers = "^[0-9\\.,]+$",
  verbose = TRUE
)

## S4 method for signature 'textstat'
noise(.Object, p_attribute, ...)
.Object | an .Object of class
... | further parameters
minTotal | minimum column sum (for a DocumentTermMatrix) for a term to qualify as non-noise
minTfIdfMean | minimum mean tf-idf value for a term to qualify as non-noise
sparse | will be passed into
stopwordsLanguage | language of the stopword list, e.g. "german", to get the stopwords defined in the tm package
minNchar | minimum number of characters for a term to qualify as non-noise
specialChars | special characters to drop
numbers | regular expression used to drop numbers
verbose | logical
p_attribute | relevant if applied to a textstat object
a list
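
The checks described for the character method can be illustrated with a minimal base-R sketch. It is not polmineR's implementation: `flag_noise` and `toy_stopwords` are hypothetical names, the stopword list is a three-word stand-in for the list the tm package would supply, the special-character check is left out because the value of the `polmineR.specialChars` option is not shown here, and the names of the list elements are an assumption. Only the `minNchar` and `numbers` defaults are taken from the usage above.

```r
# Illustrative sketch only; NOT polmineR's implementation.
# `flag_noise` and `toy_stopwords` are hypothetical names.
flag_noise <- function(tokens,
                       stopwords = character(),
                       minNchar = 2,
                       numbers = "^[0-9\\.,]+$") {
  list(
    # tokens found in the stopword list (case-insensitive comparison)
    stopwords = tokens[tolower(tokens) %in% stopwords],
    # tokens made up entirely of digits, dots and commas
    numbers = grep(numbers, tokens, value = TRUE),
    # tokens shorter than the minimum character length
    minNchar = tokens[nchar(tokens) < minNchar]
  )
}

# Hypothetical stand-in for a real German stopword list
toy_stopwords <- c("der", "die", "und")
tokens <- c("Die", "Regierung", "und", "2019", "1.000,50", "a", "Bundestag")

result <- flag_noise(tokens, stopwords = toy_stopwords)
str(result)
```

Each element of the returned list holds the tokens flagged by one criterion, so a caller could drop them all with `setdiff(tokens, unlist(result))`.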