polmineR: size-method – R documentation

size-method

Get Number of Tokens.

Description

Get the number of tokens in a corpus or partition, or the dispersion of tokens across one or more s-attributes.

Usage

size(x, ...)

## S4 method for signature 'corpus'
size(x, s_attribute = NULL, verbose = TRUE, ...)

## S4 method for signature 'character'
size(x, s_attribute = NULL, verbose = TRUE, ...)

## S4 method for signature 'partition'
size(x, s_attribute = NULL, ...)

## S4 method for signature 'partition_bundle'
size(x)

## S4 method for signature 'DocumentTermMatrix'
size(x)

## S4 method for signature 'TermDocumentMatrix'
size(x)

## S4 method for signature 'features'
size(x)

## S4 method for signature 'remote_corpus'
size(x)

## S4 method for signature 'remote_partition'
size(x)

Arguments

`x`	An object to get size(s) for.
`...`	Further arguments (used only for backwards compatibility).
`s_attribute`	A `character` vector with s-attributes (one or more).
`verbose`	A `logical` value, whether to output messages.

Details

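One or more s-attributes can be provided to get the dispersion of tokens across one or more dimensions. Combining two or more s-attributes yields meaningful results only if the corpus XML is flat, i.e. if the s-attributes are defined on the same regions rather than nested within one another.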
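To illustrate what "dispersion across s-attributes" means, here is a minimal sketch in base R that does not use polmineR at all. It assumes a hypothetical table of regions, each described by its first and last corpus position and two s-attribute values. The names `regions`, `cpos_left` and `cpos_right` are made up for this illustration, and the aggregation is not how the package computes sizes internally.

```r
# Hypothetical region table (illustration only, not polmineR internals):
# each row is one annotated region, with inclusive corpus positions and
# the values of two s-attributes defined on that same region (flat XML).
regions <- data.frame(
  cpos_left  = c(0L, 10L, 25L, 40L),
  cpos_right = c(9L, 24L, 39L, 49L),
  date       = c("2009-11-10", "2009-11-10", "2009-11-11", "2009-11-11"),
  party      = c("A", "B", "A", "A"),
  stringsAsFactors = FALSE
)

# Tokens per region: both boundaries are inclusive, hence the + 1.
regions$size <- regions$cpos_right - regions$cpos_left + 1L

# Total number of tokens (analogous to calling size() without s_attribute).
total <- sum(regions$size)

# Dispersion across one s-attribute: sum region sizes per value.
by_date <- aggregate(size ~ date, data = regions, FUN = sum)

# Dispersion across two s-attributes: sum per combination of values.
by_date_party <- aggregate(size ~ date + party, data = regions, FUN = sum)

print(total)
print(by_date)
print(by_date_party)
```

Grouping by a combination of values only works here because every row carries a value for both attributes. With nested annotation the regions of different s-attributes do not line up, which is why the flat-XML caveat matters.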

The size-method for features objects will return a named list with the size of the corpus of interest ("coi"), i.e. the number of tokens in the window, and the reference corpus ("ref"), i.e. the number of tokens that are not matched by the query and that are outside the window.

Value

If x is a corpus (a corpus object or a corpus specified by its ID), an integer vector if argument s_attribute is NULL, otherwise a two-column data.table (first column: the s-attribute, second column: "size"). If x is a subcorpus_bundle or a partition_bundle, a data.table with the columns "name" and "size".

Examples

library(polmineR)
use("polmineR") # activate the demo corpora shipped with the package

# for corpus object
corpus("REUTERS") %>% size()
corpus("REUTERS") %>% size(s_attribute = "id")
corpus("GERMAPARLMINI") %>% size(s_attribute = c("date", "party"))

# for corpus specified by ID
size("GERMAPARLMINI")
size("GERMAPARLMINI", s_attribute = "date")
size("GERMAPARLMINI", s_attribute = c("date", "party"))

# for partition object
P <- partition("GERMAPARLMINI", date = "2009-11-11")
size(P, s_attribute = "speaker")
size(P, s_attribute = "party")
size(P, s_attribute = c("speaker", "party"))

# for subcorpus
sc <- corpus("GERMAPARLMINI") %>% subset(date == "2009-11-11")
size(sc, s_attribute = "speaker")
size(sc, s_attribute = "party")
size(sc, s_attribute = c("speaker", "party"))

# for subcorpus_bundle
subcorpora <- corpus("GERMAPARLMINI") %>% split(s_attribute = "date")
size(subcorpora)

polmineR

Verbs and Nouns for Corpus Analysis

v0.8.5

GPL-3

Authors

Andreas Blaette [aut, cre] (<https://orcid.org/0000-0001-8970-8010>), Christoph Leonhardt [ctb]

Initial release

2020-09-22
