
stopwords

Collection of stopwords in multiple languages


Description

This function returns character vectors of stopwords for different languages, identified by their ISO 639-1 language codes, and allows different sources of stopwords to be selected.

The default source is the Snowball stopwords collection, but other sources are also available.

Usage

stopwords(language = "en", source = "snowball", simplify = TRUE)

Arguments

language

specify language of stopwords by ISO 639-1 code

source

specify a stopwords source. To list the currently available options, use stopwords_getsources().

simplify

logical; if TRUE, return a simple character vector; if FALSE, return a list when the original word list was nested
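
The arguments above can be explored interactively. The following is a minimal sketch, assuming the stopwords package is installed; stopwords_getlanguages() is a companion function from the same package that is not described on this page, and the sources and languages printed depend on the installed version.

```r
library(stopwords)

# List the stopword sources bundled with the installed package version
sources <- stopwords_getsources()
print(sources)

# List the language codes covered by the default "snowball" source
langs <- stopwords_getlanguages("snowball")
print(langs)

# Request English stopwords, naming the default source explicitly
en <- stopwords(language = "en", source = "snowball")
head(en)
```

Checking the available languages for a source first avoids requesting a language that the source does not cover.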

Details

The language codes for each stopword list use the two-letter ISO code from https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes. For backwards compatibility, the full English names of the languages, as used in the quanteda package, may also be used, although these are deprecated.

Value

a character vector containing the stopwords, or a list of character vectors if simplify = FALSE and the original word list was nested
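
Because the default return value is a plain character vector, it can be used directly with base R set operations. The sketch below is illustrative only: the tokens vector is made-up sample data, and the filtering step is one common use of the result rather than something the package prescribes.

```r
library(stopwords)

# Default call: a simple character vector of English stopwords
sw <- stopwords("en")
is.character(sw)

# Hypothetical sample tokens, already lowercased
tokens <- c("the", "quick", "brown", "fox", "is", "in", "the", "garden")

# Keep only the tokens that are not in the stopword list
content_words <- tokens[!tokens %in% sw]
print(content_words)
```

Matching is exact and case-sensitive, so input text generally needs to be lowercased before filtering.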

Examples

# English stopwords from the default source
stopwords("en")
# German stopwords from the default source
stopwords("de")

stopwords

Multilingual Stopword Lists

v2.2
MIT + file LICENSE
Authors
Kenneth Benoit [aut, cre], David Muhr [aut], Kohei Watanabe [aut]
Initial release
