quanteda: object2id – R documentation

Pricing

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

Get Started for Free

Documentation

quanteda

object2id

Match quanteda objects against token types

Description

Developer function to match patterns in quanteda objects against token types.

Usage

object2id(
  x,
  types,
  valuetype = c("glob", "fixed", "regex"),
  case_insensitive = TRUE,
  concatenator = "_",
  levels = 1,
  remove_unigram = FALSE,
  keep_nomatch = FALSE
)

object2fixed(
  x,
  types,
  valuetype = c("glob", "fixed", "regex"),
  case_insensitive = TRUE,
  concatenator = "_",
  levels = 1,
  remove_unigram = FALSE,
  keep_nomatch = FALSE
)

Arguments

`x`	a list of character vectors, dictionary or collocations object
`types`	token types against which patterns are matched
`valuetype`	the type of pattern matching: `"glob"` for "glob"-style wildcard expressions; `"regex"` for regular expressions; or `"fixed"` for exact matching. See valuetype for details.
`case_insensitive`	logical; if `TRUE`, ignore case when matching a `pattern` or dictionary values
`concatenator`	the concatenation character that join multi-word expression in `types`
`levels`	integers specifying the levels of entries in a hierarchical dictionary that will be applied. The top level is 1, and subsequent levels describe lower nesting levels. Values may be combined, even if these levels are not contiguous, e.g. `levels = c(1:3)` will collapse the second level into the first, but record the third level (if present) collapsed below the first (see examples).
`remove_unigram`	if `TRUE`, ignores single-word patterns
`keep_nomatch`	keep patterns that did not match

Value

a list of integer vectors containing indices of matched types

Examples

types <- c("A", "AA", "B", "BB", "B_B", "C", "C-C")

# dictionary
dict <- dictionary(list(A = c("a", "aa"), 
                        B = c("BB", "B B"),
                        C = c("C", "C-C")))
object2fixed(dict, types)
object2fixed(dict, types, remove_unigram = TRUE)

# phrase
pats <- phrase(c("a", "aa", "zz", "bb", "b b"))
object2fixed(pats, types)
object2fixed(pats, types, keep_nomatch = TRUE)

quanteda

Quantitative Analysis of Textual Data

v3.0.0

GPL-3

Authors

Kenneth Benoit [cre, aut, cph] (<https://orcid.org/0000-0002-0797-564X>), Kohei Watanabe [aut] (<https://orcid.org/0000-0001-6519-5265>), Haiyan Wang [aut] (<https://orcid.org/0000-0003-4992-4311>), Paul Nulty [aut] (<https://orcid.org/0000-0002-7214-4666>), Adam Obeng [aut] (<https://orcid.org/0000-0002-2906-4775>), Stefan Müller [aut] (<https://orcid.org/0000-0002-6315-4125>), Akitaka Matsuo [aut] (<https://orcid.org/0000-0002-3323-6330>), William Lowe [aut] (<https://orcid.org/0000-0002-1549-6163>), Christian Müller [ctb], European Research Council [fnd] (ERC-2011-StG 283794-QUANTESS)

Initial release

object2id

Description

Usage

Arguments

Value

See Also

Examples

quanteda

We don't support your browser anymore