Functions to add or retrieve corpus summary metadata
add_summary_metadata(x, extended = FALSE, ...)
get_summary_metadata(x, ...)
summarize_texts_extended(x, stop_words = stopwords("en"), n = 100)
This is provided so that a corpus object can be stored with its summary information, which avoids recomputing the summary every time summary.corpus() is called.
In later calls, if !is.null(meta(x, "summary", type = "system")) && !length(list(...)), then summary.corpus() simply returns get_system_meta() rather than computing the summary statistics on the fly, which requires tokenizing the text.
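The reuse-or-recompute rule described above can be sketched in base R. This is a minimal illustration, not quanteda's implementation: compute_summary(), add_cached_summary() and summarize_cached() are hypothetical names, the summary is stored in a plain attribute rather than in system metadata, and whitespace splitting stands in for real tokenization.

```r
# Hypothetical stand-ins for illustration; none of these are quanteda functions.

# Compute per-document counts. Splitting on whitespace is an assumption made
# only to give the "expensive" step something to do.
compute_summary <- function(texts) {
  tokens <- strsplit(texts, "[[:space:]]+")
  data.frame(
    text = names(texts),
    tokens = lengths(tokens),
    types = vapply(tokens, function(tok) length(unique(tok)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Store the summary on the object once, here as an attribute named "summary".
add_cached_summary <- function(obj) {
  attr(obj, "summary") <- compute_summary(obj)
  obj
}

# Return the stored summary when one exists and no extra arguments were given;
# otherwise recompute. Extra arguments are ignored by this toy compute step.
summarize_cached <- function(obj, ...) {
  cached <- attr(obj, "summary")
  if (!is.null(cached) && !length(list(...))) {
    return(cached)
  }
  compute_summary(obj)
}

texts <- c(doc1 = "a b c a", doc2 = "d e")
texts <- add_cached_summary(texts)
result <- summarize_cached(texts)
```

Passing any extra argument, for example summarize_cached(texts, n = 1), bypasses the stored copy and recomputes, which mirrors the !length(list(...)) part of the condition.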
add_summary_metadata() returns a corpus with summary metadata added as a data.frame, with the top-level list element named summary.
get_summary_metadata() returns the summary metadata as a data.frame.
summarize_texts_extended() returns extended summary information.
library("quanteda")

# add summary metadata to a corpus, then retrieve it
corp <- corpus(data_char_ukimmig2010)
corp <- quanteda:::add_summary_metadata(corp)
quanteda:::get_summary_metadata(corp)

## Not run: 
# using extended summary
extended_data <- quanteda:::summarize_texts_extended(data_corpus_inaugural)
textplot_wordcloud(extended_data$top_dfm, max_words = 100)

# histogram of document lengths in tokens
library("ggplot2")
ggplot(data.frame(all_tokens = extended_data$all_tokens), aes(x = all_tokens)) +
    geom_histogram(color = "darkblue", fill = "lightblue") +
    xlab("Total length in tokens")
## End(Not run)