
seq_sim

Compute similarity scores between sequences of integers


Description

Compute similarity scores between sequences of integers

Usage

seq_sim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw"),
  q = 1,
  ...
)

Arguments

a

list of integer vectors (target)

b

list of integer vectors (source). Optional for seq_distmatrix.

method

Method for distance calculation. The default is "osa", see stringdist-metrics.

q

Size of the q-gram; must be nonnegative. Only applies to method='qgram', 'jaccard' or 'cosine'.

...

Additional arguments, passed on to seq_dist.
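
The q and ... arguments described above can be sketched as follows. This is only an illustration: the two input sequences are made up, and it assumes that seq_dist accepts the Jaro-Winkler prefix factor p (as in the stringdist package), which is not stated on this page.

```r
library(stringdist)

# Two hypothetical sequences that differ by swapping the last two elements
a <- list(c(1L, 2L, 3L, 4L))
b <- list(c(1L, 2L, 4L, 3L))

# q-gram based similarity on single elements (q = 1) and on pairs (q = 2)
sim_q1 <- seq_sim(a, b, method = "jaccard", q = 1)
sim_q2 <- seq_sim(a, b, method = "jaccard", q = 2)

# p is not a formal argument of seq_sim; it is forwarded to seq_dist via ...
# (assumption: seq_dist takes p, the Winkler prefix factor, for method "jw")
sim_jw0 <- seq_sim(a, b, method = "jw")
sim_jw  <- seq_sim(a, b, method = "jw", p = 0.1)

print(c(q1 = sim_q1, q2 = sim_q2, jw0 = sim_jw0, jw = sim_jw))
```

With q = 1 both sequences contain the same set of elements, so the order of the elements is ignored; with q = 2 the swapped pair changes which q-grams occur and the similarity drops.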

Value

A numeric vector of length max(length(a),length(b)); the shorter list is recycled. If one of the entries in a or b is NA_integer_, all comparisons with that element result in NA. Missing values occurring within a sequence are treated as an ordinary number (the representation of NA_integer_).
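
The rules above for length and missing values can be checked with a short sketch. The input lists are illustrative, and the checks only restate what this section describes.

```r
library(stringdist)

# Three target sequences: one with an NA inside, one complete,
# and one entry that is NA_integer_ as a whole
a <- list(c(1L, NA_integer_, 3L), 2:4, NA_integer_)
b <- list(1:3)   # single source sequence, recycled over a

res <- seq_sim(a, b)

length(res)   # one score per element of the longer list
is.na(res)    # only the entry that is NA_integer_ as a whole yields NA
```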


Examples

L1 <- list(1:3,2:4)
L2 <- list(1:3)
seq_sim(L1,L2,method="osa")

# note how missing values are handled (L2 is recycled over L1)
L1 <- list(c(1L,NA_integer_,3L),2:4,NA_integer_)
L2 <- list(1:3)
seq_sim(L1,L2)

stringdist

Approximate String Matching, Fuzzy Text Search, and String Distance Functions

v0.9.6.3
GPL-3
Authors
Mark van der Loo [aut, cre] (<https://orcid.org/0000-0002-9807-4686>), Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb], Chris Muir [ctb], Johannes Gruber [ctb]
Initial release
