quanteda: docnames – R documentation

docnames

Get or set document names

Description

Get or set the document names of a corpus, tokens, or dfm object.

Usage

docnames(x)

docnames(x) <- value

docid(x)

Arguments

`x`	a corpus, tokens, or dfm object whose document names are to be retrieved or set
`value`	a character vector of new document names, one per document in `x` (length `ndoc(x)`)

Value

docnames() returns a character vector of the document names.

docnames(x) <- value assigns new document names to an object. Document names are always stored as character, so any non-character value is coerced to character on assignment.
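The coercion described above can be checked with a small sketch. It assumes quanteda is installed; the two-document corpus and its texts are made up for illustration and are not part of the package.

```r
library(quanteda)

# hypothetical two-document corpus, used only for illustration
corp <- corpus(c("First text.", "Second text."))

# assign integer values as document names
docnames(corp) <- 1:2

# the names come back as character, not integer
docnames(corp)
is.character(docnames(corp))
```

Because the stored names are character, later subsetting by name should use strings such as "1" rather than numeric positions.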

docid() returns an internal variable recording the original document name ("docname") from which each document came. Unless an object has been reshaped (e.g. corpus_reshape()), split (e.g. tokens_split()), or segmented (e.g. corpus_segment()), docid(x) returns the same values as docnames(x).

Note

docid() is designed primarily for developers, not for end users. In most cases, you will want docnames() instead. It is, however, the default for groups, so that documents that have previously been reshaped (e.g. corpus_reshape()), split (e.g. tokens_split()), or segmented (e.g. corpus_segment()) are regrouped under their original docnames when groups = docid(x).
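As a rough illustration of this regrouping, the sketch below reshapes a small corpus into sentences and then groups the resulting dfm back by docid(). It assumes quanteda v3 is installed and uses dfm_group() with groups passed explicitly; the object names dfmat and dfmat_grouped are chosen here for illustration.

```r
library(quanteda)

corp <- corpus(c(textone = "This is a sentence.  Another sentence.  Yet another.",
                 textwo = "Sentence 1. Sentence 2."))

# reshape to one document per sentence, then build a dfm
corpsent <- corpus_reshape(corp, to = "sentences")
dfmat <- dfm(tokens(corpsent))
ndoc(dfmat)

# regroup the sentence-level rows under their original documents
dfmat_grouped <- dfm_group(dfmat, groups = docid(dfmat))
docnames(dfmat_grouped)
```

Grouping by docnames(dfmat) instead would leave the sentence-level documents unchanged, since each sentence has its own name.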

Examples

library(quanteda)

# get and set the document names of a corpus
corp <- data_corpus_inaugural
docnames(corp) <- char_tolower(docnames(corp))

# get and set the document names of a tokens object
toks <- tokens(data_corpus_inaugural)
docnames(toks) <- char_tolower(docnames(toks))

# get and set the document names of a dfm
# (tokenize first: passing a corpus directly to dfm() is deprecated in quanteda v3)
dfmat <- dfm(tokens(data_corpus_inaugural[1:5]))
docnames(dfmat) <- char_tolower(docnames(dfmat))

# reassign the document names of the inaugural speech corpus
# (this modifies a copy in the workspace, not the packaged dataset)
docnames(data_corpus_inaugural) <- paste0("Speech", seq_len(ndoc(data_corpus_inaugural)))

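# docid: track the original document after reshaping to sentences
corp <- corpus(c(textone = "This is a sentence.  Another sentence.  Yet another.",
                 textwo = "Sentence 1. Sentence 2."))
corpsent <- corpus_reshape(corp, to = "sentences")
docnames(corpsent)            # one name per sentence
docid(corpsent)               # original document of each sentence
docid(tokens(corpsent))       # docid is carried over to tokens
docid(dfm(tokens(corpsent)))  # and to the dfm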

quanteda

Quantitative Analysis of Textual Data

v3.0.0

GPL-3

Authors

Kenneth Benoit [cre, aut, cph] (<https://orcid.org/0000-0002-0797-564X>), Kohei Watanabe [aut] (<https://orcid.org/0000-0001-6519-5265>), Haiyan Wang [aut] (<https://orcid.org/0000-0003-4992-4311>), Paul Nulty [aut] (<https://orcid.org/0000-0002-7214-4666>), Adam Obeng [aut] (<https://orcid.org/0000-0002-2906-4775>), Stefan Müller [aut] (<https://orcid.org/0000-0002-6315-4125>), Akitaka Matsuo [aut] (<https://orcid.org/0000-0002-3323-6330>), William Lowe [aut] (<https://orcid.org/0000-0002-1549-6163>), Christian Müller [ctb], European Research Council [fnd] (ERC-2011-StG 283794-QUANTESS)
