quanteda: texts – R documentation

quanteda

texts

Get or assign corpus texts [deprecated]

Description

This function is deprecated and will be removed in the next major release.

Use as.character.corpus() to turn a corpus into a simple named character vector.
Use corpus_group() instead of texts(x, groups = ...) to aggregate texts by a grouping variable.
Use [<- instead of texts()<- for replacing texts in a corpus object.
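
The first and third replacements above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: the three-document corpus, its document names, and the `party` docvar are all made up for the example.

```r
library(quanteda)

# Hypothetical toy corpus; document names and the docvar are illustrative only.
corp <- corpus(
  c(d1 = "First text.", d2 = "Second text.", d3 = "Third text."),
  docvars = data.frame(party = c("A", "B", "A"))
)

# In place of texts(corp): extract a plain character vector named by document.
txt <- as.character(corp)

# In place of texts(corp)[2] <- ...: replace a text with `[<-`.
corp[2] <- "Replacement text."

# Inspect the updated texts.
as.character(corp)
```

The extracted vector `txt` is a copy, so it still holds the original second text after the replacement.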

Usage

texts(x, groups = NULL, spacer = " ")

texts(x) <- value

Arguments

`x`	a corpus
`groups`	grouping variable for aggregating texts, equal in length to the number of documents. It is evaluated in the docvars data.frame, so docvars may be referred to by name without quoting. This behaviour of `groups` differs from earlier versions; see `news(Version >= "3.0", package = "quanteda")` for details.
`spacer`	the string inserted between texts when they are concatenated by `groups`. (The Usage signature shows a single space as the default.)
`value`	character vector of the new texts

Details

Get or replace the texts in a corpus, with grouping options. Works for plain character vectors too, if groups is a factor.

Value

For texts, a character vector of the texts in the corpus.

For texts <-, the corpus with its texts replaced by value.

Note

Texts that share a value of groups are concatenated into a single text. The order in which they are aggregated is not specified.
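
As a rough sketch of this grouping behaviour using the recommended replacement, `corpus_group()`, the example below concatenates documents by a docvar. The corpus contents and the `party` variable are hypothetical, and no output is shown because the aggregation order is not guaranteed.

```r
library(quanteda)

# Hypothetical toy corpus with an illustrative grouping docvar.
corp <- corpus(
  c(d1 = "alpha", d2 = "beta", d3 = "gamma"),
  docvars = data.frame(party = c("A", "B", "A"))
)

# Concatenate texts sharing a value of `party`; the docvar is named unquoted.
grouped <- corpus_group(corp, groups = party)

# One text per group, named by group value.
as.character(grouped)
```

Group "A" ends up holding both "alpha" and "gamma", while group "B" holds only "beta".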

As a matter of good text analysis workflow, you are strongly encouraged not to modify the substance of the texts in a corpus. This sort of processing is better performed through downstream operations. For instance, do not lowercase the texts in a corpus, or you will never be able to recover the original case. Instead, apply tokens_tolower() after applying tokens() to a corpus, or use the option tolower = TRUE in dfm().
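
A minimal sketch of that downstream approach, assuming a made-up one-document corpus: lowercasing is applied to the tokens or requested in `dfm()`, and the corpus itself keeps its original case.

```r
library(quanteda)

# Hypothetical one-document corpus with mixed case.
corp <- corpus(c(d1 = "The Quick Brown Fox"))

# Tokenize first, then lowercase the tokens rather than the corpus.
toks <- tokens(corp)
toks_lower <- tokens_tolower(toks)

# Alternatively, request lowercasing when building the document-feature matrix.
mat <- dfm(toks, tolower = TRUE)

# The corpus still holds the text in its original case.
as.character(corp)
```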

quanteda

Quantitative Analysis of Textual Data

v3.0.0

GPL-3

Authors

Kenneth Benoit [cre, aut, cph] (<https://orcid.org/0000-0002-0797-564X>), Kohei Watanabe [aut] (<https://orcid.org/0000-0001-6519-5265>), Haiyan Wang [aut] (<https://orcid.org/0000-0003-4992-4311>), Paul Nulty [aut] (<https://orcid.org/0000-0002-7214-4666>), Adam Obeng [aut] (<https://orcid.org/0000-0002-2906-4775>), Stefan Müller [aut] (<https://orcid.org/0000-0002-6315-4125>), Akitaka Matsuo [aut] (<https://orcid.org/0000-0002-3323-6330>), William Lowe [aut] (<https://orcid.org/0000-0002-1549-6163>), Christian Müller [ctb], European Research Council [fnd] (ERC-2011-StG 283794-QUANTESS)
