
subset_query

Subset tCorpus token data using a query


Description

A convenience function that searches for contexts (documents, sentences), and uses the results to subset the tCorpus token data.

Usage

subset_query(
  tc,
  query,
  feature = "token",
  context_level = c("document", "sentence"),
  not = F,
  as_ascii = F,
  window = NA
)

Arguments

tc

A tCorpus

query

A character string that is a query. See search_contexts for query syntax.

feature

The name of the feature column on which the query is used.

context_level

Select whether the query and subset are performed at the document or sentence level.

not

If TRUE, perform a NOT search: return the documents/sentences in which the query is not found.

as_ascii

If TRUE, perform the search in ASCII.

window

If used, a word distance is used as the context (overrides context_level).

Details

See the documentation for search_contexts for an explanation of the query language.
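
As a rough illustration of what a query can look like, the sketch below reuses the toy corpus from the Examples section and assumes the Boolean operators AND and OR from the search_contexts query syntax; the object names tc_and and tc_or are made up for this example.

```r
library(corpustools)

text = c('A B C', 'D E F. G H I', 'A D', 'GGG')
tc = create_tcorpus(text, doc_id = c('a','b','c','d'), split_sentences = TRUE)

## keep documents that contain both A and D
## (assumes the AND operator described in search_contexts)
tc_and = subset_query(tc, 'A AND D')
tc_and$meta

## keep documents that contain A or G
## (assumes the OR operator described in search_contexts)
tc_or = subset_query(tc, 'A OR G')
tc_or$meta
```

Because the default context_level is "document", both terms of the AND query only need to occur somewhere in the same document.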

Examples

text = c('A B C', 'D E F. G H I', 'A D', 'GGG')
tc = create_tcorpus(text, doc_id = c('a','b','c','d'), split_sentences = TRUE)

## keep only the documents that match the query 'A'
tc2 = subset_query(tc, 'A')
tc2$meta
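
The context_level and not arguments described above can be tried on the same toy corpus. This is a minimal sketch rather than part of the package's own examples; tc_sent and tc_not are illustrative names.

```r
library(corpustools)

text = c('A B C', 'D E F. G H I', 'A D', 'GGG')
tc = create_tcorpus(text, doc_id = c('a','b','c','d'), split_sentences = TRUE)

## sentence level: keep only the sentences in which G occurs
tc_sent = subset_query(tc, 'G', context_level = 'sentence')
tc_sent$tokens

## NOT search: keep the documents in which A does not occur
tc_not = subset_query(tc, 'A', not = TRUE)
tc_not$meta
```

At the sentence level the other sentences of a matching document are dropped as well, so the choice of context_level changes how much text surrounds each hit.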

corpustools

Managing, Querying and Analyzing Tokenized Text

Version: 0.4.10
License: GPL-3
Authors: Kasper Welbers and Wouter van Atteveldt
Initial release: 2022-05-03
