Data Frame Source
Create a data frame source.
DataframeSource(x)
x: A data frame giving the texts and metadata.
A data frame source interprets each row of the data frame x as a
document. The first column must be named "doc_id" and contain a unique
string identifier for each document. The second column must be named
"text" and contain a UTF-8 encoded string representing the
document's content. Optional additional columns are used as document level
metadata.
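The column layout described above can be checked before the data frame is passed to DataframeSource. The following is a minimal sketch in base R only; the helper check_source_frame and the example column author are hypothetical names chosen for illustration and are not part of tm. It assumes the input starts out with its columns in the wrong order, adds identifiers, and moves the two required columns to the front.

```r
# Hypothetical helper (not part of tm): verifies the layout described above.
check_source_frame <- function(x) {
  stopifnot(is.data.frame(x), ncol(x) >= 2)
  if (!identical(names(x)[1:2], c("doc_id", "text")))
    stop("the first two columns must be named 'doc_id' and 'text'")
  if (anyDuplicated(x$doc_id) > 0)
    stop("'doc_id' values must be unique")
  if (!is.character(x$text))
    stop("'text' must be a character column")
  invisible(TRUE)
}

# Example input whose columns are not yet in the required order
raw <- data.frame(text = c("First text.", "Second text."),
                  author = c("A", "B"),
                  stringsAsFactors = FALSE)

# Add string identifiers, then move the required columns to the front;
# any remaining columns are kept as document level metadata
raw$doc_id <- paste0("doc_", seq_len(nrow(raw)))
docs <- raw[, c("doc_id", "text", setdiff(names(raw), c("doc_id", "text")))]

check_source_frame(docs)
names(docs)
```

A data frame that passes this check has the shape the description above asks for; whether DataframeSource applies exactly these checks internally is not stated here, so treat the helper as a convenience rather than a mirror of the package's own validation.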
Returns an object inheriting from DataframeSource, SimpleSource,
and Source.
See also readtext, which reads texts in multiple formats into a data frame
suitable for processing by DataframeSource.
docs <- data.frame(doc_id = c("doc_1", "doc_2"),
                   text = c("This is a text.", "This is another one."),
                   dmeta1 = 1:2, dmeta2 = letters[1:2],
                   stringsAsFactors = FALSE)
(ds <- DataframeSource(docs))
x <- Corpus(ds)
inspect(x)
meta(x)