
cast_text

Cast annotations to text


Description

Cast labeled tokens to sentences.

Usage

cast_text(tokens, annotation, ..., text_col = "token", na.rm = TRUE)

Arguments

tokens

A tokenIndex

annotation

The name of the annotation column (the "column" argument passed to annotate_tqueries)

...

Optionally, group annotations together. Named arguments can be given where the name is the new group, and the value is a character vector with values in the annotation column. For example, text = c('verb','predicate') would group the 'verb' and 'predicate' nodes together under the name 'text'.

text_col

The name of the column in tokens with the text. Usually this is "token", but some parsers use alternatives such as 'word'.

na.rm

If TRUE (default), drop tokens whose annotation id is NA (i.e. tokens without labels)

Value

a data.table
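To make the idea of "casting" concrete, here is a minimal base R sketch of the operation: tokens that share an annotation id and a label are pasted into one string, giving one row per annotation and one column per label. This is not the package's implementation. The toy data, the id value "a1", and the exact output layout are assumptions made for illustration only.

```r
# Toy token table (made-up data). The "clause" / "clause_id" column names
# are assumed here to stand in for an annotation column and its id column.
toy <- data.frame(
  token     = c("Mary", "was", "kissed", "by", "John", "."),
  clause    = c("object", NA, "verb", NA, "subject", NA),
  clause_id = c("a1", NA, "a1", NA, "a1", NA),
  stringsAsFactors = FALSE
)

# Drop tokens without a label (the behaviour described for na.rm)
labeled <- toy[!is.na(toy$clause_id), ]

# Paste the tokens of each (annotation id, label) pair into one string:
# rows are annotation ids, columns are labels
wide <- tapply(labeled$token,
               list(labeled$clause_id, labeled$clause),
               paste, collapse = " ")
wide
```

The real function returns a data.table rather than a character matrix, so treat this only as a picture of the reshaping step.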

Examples

library(rsyntax)

## use a single document from the example data
tokens = tokens_spacy[tokens_spacy$doc_id == 'text3',]

## two simple example tqueries
passive = tquery(pos = "VERB*", label = "verb", fill=FALSE,
                 children(relation = "agent",
                          children(label="subject")),
                 children(relation = "nsubjpass", label="object"))
active =  tquery(pos = "VERB*", label = "verb", fill=FALSE,
                 children(relation = c("nsubj", "nsubjpass"), label = "subject"),
                 children(relation = "dobj", label="object"))

tokens = annotate_tqueries(tokens, "clause", pas=passive, act=active, overwrite=TRUE)

cast_text(tokens, 'clause')

## group annotations
cast_text(tokens, 'clause', text = c('verb','object'))

## use grouping to sort
cast_text(tokens, 'clause', subject = 'subject', 
                            verb = 'verb', object = 'object')
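The grouping arguments used in the last two calls can be pictured as a relabeling step that runs before the text is pasted together. The base R sketch below illustrates that idea only; the `labels` vector and the `groups` list are hypothetical. Whether cast_text keeps or drops labels that are not named in any group is not stated above, so this sketch simply keeps them.

```r
# Hypothetical labels, as they might appear in an annotation column
labels <- c("subject", "verb", "object", "verb")

# Named groups in the style of the ... arguments:
# name = new group, value = labels that belong to it
groups <- list(text = c("verb", "object"))

# Build a label -> group lookup, then relabel. Labels that are not
# mentioned in any group keep their own name (an assumption of this sketch).
lookup  <- setNames(rep(names(groups), lengths(groups)), unlist(groups))
grouped <- unname(ifelse(labels %in% names(lookup), lookup[labels], labels))
grouped
```

After this step, every token labeled 'verb' or 'object' would be collected under a single 'text' group.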

rsyntax

Extract Semantic Relations from Text by Querying and Reshaping Syntax

Version: 0.1.1
License: GPL-3
Authors: Kasper Welbers and Wouter van Atteveldt
Initial release: 2020-10-22
