corpustools: add_multitoken_label – R documentation


add_multitoken_label

Choose and add multitoken strings based on multitoken categories

Description

Given a multitoken category (e.g., named entity ids), this function finds the most frequently occurring string in this category and adds it as a label for the category.

Usage

add_multitoken_label(
  tc,
  colloc_id,
  feature = "token",
  new_feature = sprintf("%s_l", colloc_id),
  pref_subset = NULL
)

Arguments

`tc`	a tcorpus object
`colloc_id`	the data column containing the unique id for multitoken tokens
`feature`	the name of the feature column
`new_feature`	the name of the new feature column
`pref_subset`	Optionally, a subset call specifying a subset of tokens that takes priority when finding the most frequently occurring string
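
Examples

The labelling idea described above can be illustrated with a minimal base R sketch. This is not the corpustools implementation and does not call the package. The toy `tokens` data frame, its `entity_id` and `occurrence` columns, and the helper `most_frequent` are all assumptions made for illustration. The `pref_subset` behaviour is not modelled.

```r
# Toy token table. Column names and values are illustrative assumptions,
# not the internal layout of a tcorpus object.
tokens <- data.frame(
  token      = c("Barack", "Obama", "Obama", "Barack", "Obama",
                 "Merkel", "Angela", "Merkel", "Merkel"),
  entity_id  = c(1, 1, 1, 1, 1, 2, 2, 2, 2),   # the multitoken category
  occurrence = c(1, 1, 2, 3, 3, 4, 5, 5, 6),   # one value per mention
  stringsAsFactors = FALSE
)

# Build one string per mention by pasting its tokens together.
strings <- aggregate(token ~ entity_id + occurrence, data = tokens,
                     FUN = paste, collapse = " ")

# Hypothetical helper: the most frequent value in a character vector
# (ties resolve to the alphabetically first string).
most_frequent <- function(x) names(which.max(table(x)))

# One label per category: the most frequent string among its mentions.
labels <- vapply(split(strings$token, strings$entity_id),
                 most_frequent, character(1))

# Name the new column as the documented default would for this id column:
# sprintf("%s_l", colloc_id)
new_feature <- sprintf("%s_l", "entity_id")
tokens[[new_feature]] <- unname(labels[as.character(tokens$entity_id)])

print(labels)
```

In this sketch, category 1 is labelled "Barack Obama" and category 2 is labelled "Merkel", because those strings occur most often among the mentions of each category.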

corpustools

Managing, Querying and Analyzing Tokenized Text

v0.4.10

GPL-3

Authors

Kasper Welbers and Wouter van Atteveldt

Initial release

2022-05-03

