sw_jockers

Matthew Jockers's Expanded Topic Modeling Stopword List


Description

A dataset containing a character vector of the stopwords Matthew Jockers used for topic modeling. He later switched to removing everything but nouns; see http://www.matthewjockers.net/2013/04/12/secret-recipe-for-topic-modeling-themes/.

Usage

data(sw_jockers)

Format

A character vector with 5,902 elements
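
Examples

Because the dataset is a plain character vector, it can be used to filter tokens with ordinary vector operations. The following is a minimal sketch, not taken from the package documentation: `remove_stopwords` is a hypothetical helper, and `demo_stopwords` is a tiny stand-in list so that the code runs without lexicon installed. Lower-casing both sides before matching is an assumption about how the tokens are prepared.

```r
# Hypothetical helper (not part of lexicon): drop every token that
# appears in a stopword vector, ignoring case.
remove_stopwords <- function(tokens, stopwords) {
  tokens[!(tolower(tokens) %in% tolower(stopwords))]
}

# Tiny stand-in list so the sketch runs without the package installed.
# It is illustrative only and is not a sample of sw_jockers.
demo_stopwords <- c("the", "of", "and", "a", "in")

tokens <- c("The", "whale", "and", "the", "sea")
kept <- remove_stopwords(tokens, demo_stopwords)
print(kept)

# With lexicon installed, the real list could be used instead:
# data(sw_jockers, package = "lexicon")
# remove_stopwords(tokens, sw_jockers)
```

The helper takes the stopword vector as an argument, so the same call should work with `sw_jockers` or any other character vector of stopwords.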

lexicon

Lexicons for Text Analysis

v1.2.1
GPL-3
Authors
Tyler Rinker [aut, cre, cph], University of Notre Dame [dtc, cph], Department of Knowledge Technologies [dtc, cph], Unicode, Inc. [dtc, cph], John Higgins [dtc, cph], Grady Ward [dtc], Heiko Possel [dtc], Michal Boleslav Mechura [dtc, cph], Bing Liu [dtc], Minqing Hu [dtc], Saif M. Mohammad [dtc], Peter Turney [dtc], Erik Cambria [dtc], Soujanya Poria [dtc], Rajiv Bajpai [dtc], Bjoern Schuller [dtc], SentiWordNet [dtc, cph], Liang Wu [dtc, cph], Fred Morstatter [dtc, cph], Huan Liu [dtc, cph], Grammar Revolution [dtc, cph], Vidar Holen [dtc, cph], Alejandro U. Alvarez [dtc, cph], Stackoverflow User user2592414 [dtc, cph], BannedWordList.com [dtc, cph], Apache Software Foundation [dtc, cph], Andrew Kachites McCallum [dtc, cph], Alireza Savand [dtc, cph], Zact Anger [dtc, cph], Titus Wormer [dtc, cph], Colin Martindale [dtc, cph], John Wiseman [dtc, cph], Nadra Pencle [dtc, cph], Irina Malaescu [dtc, cph]
Initial release
