textstat_proxy

[Experimental] Compute document/feature proximity


Description

This is the underlying function for textstat_dist and textstat_simil, but it returns a TsparseMatrix.

Usage

textstat_proxy(
  x,
  y = NULL,
  margin = c("documents", "features"),
  method = c("cosine", "correlation", "jaccard", "ejaccard", "dice", "edice", "hamman",
    "simple matching", "euclidean", "chisquared", "hamming", "kullback", "manhattan",
    "maximum", "canberra", "minkowski"),
  p = 2,
  min_proxy = NULL,
  rank = NULL,
  use_na = FALSE
)

Arguments

x

a dfm object; y is an optional target matrix matching x in the margin on which the similarity or distance will be computed.

y

if a dfm object is provided, proximity between documents or features in x and y is computed.

margin

identifies the margin of the dfm on which similarity or distance will be computed: "documents" for documents or "features" for word/term features.

method

character; the method identifying the similarity or distance measure to be used; see Details.

p

the power of the Minkowski distance; relevant when method = "minkowski".

min_proxy

the minimum proximity value to be recorded.

rank

an integer value specifying the top-n highest proximity values to be recorded.

use_na

if TRUE, return NA for proximity to empty vectors. Note that use of NA makes the proximity matrices denser.
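
Examples

The arguments above can be illustrated with a minimal sketch. It assumes the quanteda and quanteda.textstats packages are installed; the toy texts and the object names (txt, dfmat, prox, prox_d1, dist_mink) are made up for illustration and do not come from the package documentation.

```r
library(quanteda)
library(quanteda.textstats)

# Toy corpus (illustrative data only)
txt <- c(d1 = "a b c d", d2 = "a b e f", d3 = "x y z")
dfmat <- dfm(tokens(txt))

# Document-by-document cosine similarity, returned as a sparse Matrix object
prox <- textstat_proxy(dfmat, margin = "documents", method = "cosine")
as.matrix(prox)

# Proximity of every document in dfmat to one target document passed as y;
# y is a row subset of the same dfm, so its features match those of x
prox_d1 <- textstat_proxy(dfmat, y = dfmat[1, ], method = "cosine")

# Minkowski distance between documents with power p = 3
dist_mink <- textstat_proxy(dfmat, method = "minkowski", p = 3)
```

Converting with as.matrix() is convenient for inspecting small results; for large dfm objects it is usually preferable to keep the sparse representation and to limit what is stored with min_proxy or rank.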

See Also

textstat_dist, textstat_simil

quanteda.textstats

Textual Statistics for the Quantitative Analysis of Textual Data

v0.94.1
GPL-3
Authors
Kenneth Benoit [cre, aut, cph] (<https://orcid.org/0000-0002-0797-564X>), Kohei Watanabe [aut] (<https://orcid.org/0000-0001-6519-5265>), Haiyan Wang [aut] (<https://orcid.org/0000-0003-4992-4311>), Jiong Wei Lua [aut], Jouni Kuha [aut] (<https://orcid.org/0000-0002-1156-8465>), European Research Council [fnd] (ERC-2011-StG 283794-QUANTESS)
Initial release
