
readReut21578XML

Read In a Reuters-21578 XML Document


Description

Read in a Reuters-21578 XML document.

Usage

readReut21578XML(elem, language, id)
readReut21578XMLasPlain(elem, language, id)

Arguments

elem

a named list with a component content, which must hold the document to be read in.

language

a string giving the language.

id

Not used.
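
As a rough illustration of the arguments above, the sketch below calls the reader directly on a hand-built elem list. It assumes that the Reuters-21578 XML sample files in the "texts/crude" directory of package tm are installed; the variable names and the id value "example" are made up for this sketch. In normal use a Source supplies elem, so calling the reader by hand is mainly useful for inspection.

```r
library(tm)

# Locate one of the Reuters-21578 XML sample files shipped with tm
# (assumption: the package's "texts/crude" directory is installed).
crude_dir <- system.file("texts", "crude", package = "tm")
f <- list.files(crude_dir, pattern = "\\.xml$", full.names = TRUE)[1]

# Build the elem list by hand: the content component holds the raw
# bytes of the XML document to be read in.
elem <- list(content = readBin(f, what = "raw", n = file.size(f)), uri = f)

# The id value is a placeholder; the help page documents id as not used.
doc <- readReut21578XMLasPlain(elem, language = "en", id = "example")

class(doc)     # class vector of the returned text document
meta(doc)      # metadata extracted from the XML
content(doc)   # the article body as text
```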

Value

An XMLTextDocument for readReut21578XML, or a PlainTextDocument for readReut21578XMLasPlain, representing the text and metadata extracted from elem$content.

References

Emms, Martin and Luz, Saturnino (2007). Machine Learning for Natural Language Processing. European Summer School of Logic, Language and Information, course reader. http://www.homepages.ed.ac.uk/sluzfil/esslli07/mlfornlp.pdf

Lewis, David (1997). Reuters-21578 Text Categorization Collection Distribution 1.0. http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html

Luz, Saturnino. XML-encoded version of Reuters-21578. http://www.homepages.ed.ac.uk/sluzfil/esslli07/data/reuters21578-xml.tar.bz2

See Also

Reader for basic information on the reader infrastructure employed by package tm.
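
In practice the reader is usually handed to a corpus constructor rather than called directly. The following minimal sketch follows the pattern used in the tm vignette and again assumes the sample files in the package's "texts/crude" directory; the names reut21578 and reuters are only illustrative.

```r
library(tm)

# Directory with the Reuters-21578 XML sample files shipped with tm.
reut21578 <- system.file("texts", "crude", package = "tm")

# DirSource delivers each file as an elem list; the reader turns
# every elem into a PlainTextDocument.
reuters <- VCorpus(DirSource(reut21578, mode = "binary"),
                   readerControl = list(reader = readReut21578XMLasPlain))

length(reuters)       # number of documents read
meta(reuters[[1]])    # metadata of the first document
```

Passing readReut21578XML as the reader instead should yield XMLTextDocument objects, as described under Value.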


tm

Text Mining Package

v0.7-8
GPL-3
Authors
Ingo Feinerer [aut, cre] (<https://orcid.org/0000-0001-7656-8338>), Kurt Hornik [aut] (<https://orcid.org/0000-0003-4198-9911>), Artifex Software, Inc. [ctb, cph] (pdf_info.ps taken from GPL Ghostscript)
Initial release
2020-11-17
