stringdist: seq_sim – R documentation

stringdist

seq_sim

Compute similarity scores between sequences of integers

Description

Compute similarity scores between sequences of integers

Usage

seq_sim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw"),
  q = 1,
  ...
)

Arguments

`a`	`list` of `integer` vectors (target)
`b`	`list` of `integer` vectors (source). Optional for `seq_distmatrix`.
`method`	Method for distance calculation. The default is `"osa"`, see `stringdist-metrics`.
`q`	Size of the q-gram; must be nonnegative. Only applies to `method='qgram'`, `'jaccard'` or `'cosine'`.
`...`	additional arguments are passed on to `seq_dist`.
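
As a rough illustration of how `method` and `q` interact, the sketch below scores the same pair of sequences with a q-gram based method at two q-gram sizes, and once with an edit-based method. It assumes the stringdist package is installed; the inputs and variable names (`a`, `b`, `s_q1`, `s_q2`, `s_osa`) are made up for this example and do not come from the package documentation.

```r
library(stringdist)

# Illustrative inputs (hypothetical, chosen only for this sketch)
a <- list(c(1L, 2L, 3L, 4L))
b <- list(c(2L, 3L, 4L, 5L))

# q is used by the q-gram based methods ("qgram", "jaccard", "cosine")
s_q1 <- seq_sim(a, b, method = "jaccard", q = 1)
s_q2 <- seq_sim(a, b, method = "jaccard", q = 2)

# an edit-based method such as "osa" takes no q-gram size
s_osa <- seq_sim(a, b, method = "osa")

print(c(jaccard_q1 = s_q1, jaccard_q2 = s_q2, osa = s_osa))
```

Each call returns a single score here because both lists hold one sequence.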

Value

A numeric vector of length `max(length(a), length(b))`. If one of the entries in `a` or `b` is `NA_integer_`, all comparisons with that element result in `NA`. Missing values occurring within a sequence are treated as an ordinary number (the representation of `NA_integer_`).
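
The statements above can be checked directly. The following is a small sketch, assuming the stringdist package is installed; the inputs are hypothetical and the logical checks simply restate the documented behaviour rather than add to it.

```r
library(stringdist)

# Illustrative inputs: three targets, one source that is recycled
a <- list(1:3, NA_integer_, c(1L, NA_integer_, 3L))
b <- list(1:3)

s <- seq_sim(a, b)

# one score per (recycled) pair
len_ok <- length(s) == max(length(a), length(b))

# an entry that is NA_integer_ itself gives NA
whole_na <- is.na(s[2])

# an NA inside a sequence is compared like any other integer, so a score is returned
inner_scored <- !is.na(s[3])

print(c(len_ok = len_ok, whole_na = whole_na, inner_scored = inner_scored))
```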

Examples

# compare two target sequences against a single source sequence
L1 <- list(1:3,2:4)
L2 <- list(1:3)
seq_sim(L1,L2,method="osa")

# note how missing values are handled (L2 is recycled over L1)
L1 <- list(c(1L,NA_integer_,3L),2:4,NA_integer_)
L2 <- list(1:3)
seq_sim(L1,L2)

stringdist

Approximate String Matching, Fuzzy Text Search, and String Distance Functions

v0.9.6.3

GPL-3

Authors

Mark van der Loo [aut, cre] (<https://orcid.org/0000-0002-9807-4686>), Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb], Chris Muir [ctb], Johannes Gruber [ctb]
