
readTagged

Read In a POS-Tagged Word Text Document


Description

Return a function which reads in a text document containing POS-tagged words.

Usage

readTagged(...)

Arguments

...

Arguments passed to TaggedTextDocument.

Details

Formally, this function is a function generator: it returns a function (which reads in a text document) with a well-defined signature, and the returned function can still access the arguments originally passed in (...) via lexical scoping.
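
The generator pattern described above can be sketched in base R. This is only an illustration of lexical scoping, not the tm implementation: make_reader and the list it returns are hypothetical, and sep = "/" is an arbitrary argument chosen for the example.

```r
# Hypothetical generator (not part of tm): it only mimics the pattern.
make_reader <- function(...) {
  args <- list(...)  # captured once, when the generator is called
  # The returned function has the fixed formals elem, language and id,
  # yet it can still see 'args' through lexical scoping.
  function(elem, language, id) {
    list(content = elem$content,
         language = language,
         id = id,
         extra = args)
  }
}

reader <- make_reader(sep = "/")
res <- reader(list(content = "The/at jury/nn"), language = "en", id = "id1")
str(res)
```

The returned closure keeps a uniform signature, which is what lets a corpus constructor call any reader the same way, while per-reader options travel along in the enclosing environment.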

Value

A function with the following formals:

elem

a named list with the component content, which must hold the document to be read in, or the component uri, which must hold a connection object or a character string.

language

a string giving the language.

id

a character string giving a unique identifier for the created text document.

The function returns a TaggedTextDocument representing the text and metadata extracted from elem$content or elem$uri. The argument id is used as a fallback if elem$uri is NULL.

See Also

Reader for basic information on the reader infrastructure employed by package tm.

Examples

library(tm)  # provides readTagged(), VectorSource(), getElem() and stepNext()

# See http://www.nltk.org/book/ch05.html or file ca01 in the Brown corpus
x <- paste("The/at grand/jj jury/nn commented/vbd on/in a/at number/nn of/in",
           "other/ap topics/nns ,/, among/in them/ppo the/at Atlanta/np and/cc",
           "Fulton/np-tl County/nn-tl purchasing/vbg departments/nns which/wdt",
           "it/pps said/vbd ``/`` are/ber well/ql operated/vbn and/cc follow/vb",
           "generally/rb accepted/vbn practices/nns which/wdt inure/vb to/in the/at",
           "best/jjt interest/nn of/in both/abx governments/nns ''/'' ./.")
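# Wrap the string in a source and fetch its first element
vs <- VectorSource(x)
elem <- getElem(stepNext(vs))
# Build the reader and apply it; the outer parentheses print the document
(doc <- readTagged()(elem, language = "en", id = "id1"))
# Extract the tagged words from the document
tagged_words(doc)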

tm

Text Mining Package

v0.7-8
GPL-3
Authors
Ingo Feinerer [aut, cre] (<https://orcid.org/0000-0001-7656-8338>), Kurt Hornik [aut] (<https://orcid.org/0000-0003-4198-9911>), Artifex Software, Inc. [ctb, cph] (pdf_info.ps taken from GPL Ghostscript)
Initial release
2020-11-17
