rsyntax: as_tokenindex – R documentation

as_tokenindex

Prepare a tokenIndex

Description

Creates a tokenIndex data.table. Accepts any data.frame that contains the required columns (doc_id, sentence, token_id, parent, relation). Each of these columns must be named with one of the candidate names given in the corresponding argument.

The data in the data.frame is left unchanged, with three exceptions. First, columns are renamed to the default names if other names are used. Second, if a token has itself as its parent (which some parsers use to indicate the root), the parent is set to NA (as other parsers do) to prevent infinite cycles. Third, the data is sorted by doc_id, sentence, token_id.
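The three changes described above can be illustrated in base R. This is only a sketch of the described behaviour, not the package's implementation: the `tokens` data.frame and its values are made up for illustration, and the `candidates` list simply copies the default argument values from the Usage section.

```r
# Hypothetical parser output that uses non-default column names.
tokens <- data.frame(
  document_id   = c("d1", "d1", "d1"),
  sentence_id   = c(1, 1, 1),
  token_id      = c(3, 1, 2),
  head_token_id = c(2, 2, 2),   # token 2 has itself as parent (the root)
  dep_rel       = c("dobj", "nsubj", "ROOT"),
  stringsAsFactors = FALSE
)

# Candidate names per required column (copied from the Usage defaults).
candidates <- list(
  doc_id   = c("doc_id", "document_id"),
  sentence = c("sentence", "sentence_id"),
  token_id = c("token_id"),
  parent   = c("parent", "head_token_id"),
  relation = c("relation", "dep_rel")
)

# 1. Rename the first matching candidate to the default name.
for (std in names(candidates)) {
  found <- intersect(candidates[[std]], names(tokens))
  if (length(found) == 0) stop("no column found for ", std)
  names(tokens)[names(tokens) == found[1]] <- std
}

# 2. A token that is its own parent becomes a root with parent NA.
tokens$parent[which(tokens$parent == tokens$token_id)] <- NA

# 3. Sort by doc_id, sentence, token_id.
tokens <- tokens[order(tokens$doc_id, tokens$sentence, tokens$token_id), ]
rownames(tokens) <- NULL
```

After these steps the columns carry the default names, the root token has `NA` as its parent, and the rows are in token order within the sentence.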

Usage

as_tokenindex(
  tokens,
  doc_id = c("doc_id", "document_id"),
  sentence = c("sentence", "sentence_id"),
  token_id = c("token_id"),
  parent = c("parent", "head_token_id"),
  relation = c("relation", "dep_rel")
)

Arguments

`tokens`	A data.frame, data.table, or tokenIndex.
`doc_id`	Candidate names for the document id column.
`sentence`	Candidate names for the sentence (id/index) column.
`token_id`	Candidate names for the token id column. The column has to be numeric (some parsers return token ids as numbers with a prefix, such as t_1 or w_1).
`parent`	Candidate names for the parent id column. The column has to be numeric.
`relation`	Candidate names for the relation column.
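Because token_id and parent have to be numeric, prefixed ids need to be converted before the data is passed in. A minimal base R sketch, assuming the prefix consists only of non-digit characters (the `raw_ids` and `raw_parents` values are hypothetical):

```r
# Hypothetical prefixed ids as returned by some parsers.
raw_ids     <- c("t_1", "t_2", "t_10")
raw_parents <- c("t_2", NA, "t_2")   # NA marks the root

# Strip everything before the first digit, then convert to numeric.
strip_prefix <- function(x) as.numeric(sub("^[^0-9]*", "", x))

token_id <- strip_prefix(raw_ids)
parent   <- strip_prefix(raw_parents)
```

Missing values pass through unchanged, so a root that is already marked with `NA` stays `NA`.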

Value

A tokenIndex data.table.

Examples

library(rsyntax)
as_tokenindex(tokens_corenlp)

rsyntax

Extract Semantic Relations from Text by Querying and Reshaping Syntax

v0.1.1

GPL-3

Authors

Kasper Welbers and Wouter van Atteveldt

Initial release

2020-10-22