
stopwords

Stopwords


Description

Return various kinds of stopwords with support for different languages.

Usage

stopwords(kind = "en")

Arguments

kind

A character string identifying the desired stopword list.

Details

Available stopword lists are:

catalan

Catalan stopwords (obtained from http://latel.upf.edu/morgana/altres/pub/ca_stop.htm),

romanian

Romanian stopwords (extracted from http://snowball.tartarus.org/otherapps/romanian/romanian1.tgz),

SMART

English stopwords from the SMART information retrieval system, as documented in Appendix 11 of https://jmlr.csail.mit.edu/papers/volume5/lewis04a/; this list coincides with the stopword list used by the MC toolkit (https://www.cs.utexas.edu/users/dml/software/mc/),

and a set of stopword lists in different languages from the Snowball stemmer project (obtained from http://svn.tartarus.org/snowball/trunk/website/algorithms/*/stop.txt). Supported languages are danish, dutch, english, finnish, french, german, hungarian, italian, norwegian, portuguese, russian, spanish, and swedish. Language names are case sensitive and must be written as listed here. Alternatively, the corresponding IETF language tag may be used in place of a language name (for example, "en" for english).
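As a small illustration of the two ways to name a Snowball list, the sketch below requests the same list once by language name and once by IETF language tag and checks that the results match. It assumes the tm package is installed; the variable names are made up for this example.

```r
library(tm)

# Request the German Snowball list by its (lowercase) language name ...
by_name <- stopwords("german")

# ... and by its IETF language tag.
by_tag <- stopwords("de")

# Both calls are expected to return the same character vector.
same_list <- identical(by_name, by_tag)
print(same_list)
```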

Value

A character vector containing the requested stopwords. An error is raised if no stopwords are available for the requested kind.
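Because an unknown kind raises an error rather than returning an empty result, calling code may want to catch it. The following is a minimal sketch, assuming the tm package is installed; safe_stopwords is a hypothetical helper that is not part of tm, and "klingon" is just a placeholder for a kind with no stopword list.

```r
library(tm)

# Hypothetical helper (not part of tm): return an empty character
# vector, with a warning, when no list exists for the requested kind.
safe_stopwords <- function(kind = "en") {
  tryCatch(
    stopwords(kind),
    error = function(e) {
      warning("No stopword list for kind '", kind, "': ", conditionMessage(e))
      character(0)
    }
  )
}

english <- safe_stopwords("en")

# "klingon" is assumed not to be an available list, so this falls back
# to character(0); the warning is silenced for this demonstration.
missing <- suppressWarnings(safe_stopwords("klingon"))
```

Whether a silent fallback is appropriate depends on the application; in many pipelines letting the error propagate is the safer choice.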

Examples

stopwords("en")
stopwords("SMART")
stopwords("german")
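
The returned vector is typically used to filter tokens. The sketch below shows one way to do that with base R only, assuming the tm package is installed; the sample sentence and variable names are invented for illustration, and the whitespace tokenization is deliberately naive.

```r
library(tm)

# Invented sample text for illustration.
text <- "This is an example of a sentence with some stopwords"

# Naive tokenization: lowercase, then split on runs of whitespace.
tokens <- strsplit(tolower(text), "[[:space:]]+")[[1]]

# Keep only the tokens that do not appear in the English stopword list.
content_tokens <- tokens[!tokens %in% stopwords("en")]
print(content_tokens)
```

For corpora and text documents, tm also provides removeWords(), which can be passed the same stopword vector.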

tm

Text Mining Package

v0.7-8
GPL-3
Authors
Ingo Feinerer [aut, cre] (<https://orcid.org/0000-0001-7656-8338>), Kurt Hornik [aut] (<https://orcid.org/0000-0003-4198-9911>), Artifex Software, Inc. [ctb, cph] (pdf_info.ps taken from GPL Ghostscript)
Initial release
2020-11-17
