LD.thin

LD thinning


Description

Select SNPs in LD below a given threshold.

Usage

LD.thin(x, threshold, max.dist = 250e3, beg = 1, end = ncol(x),
        which.snps, dist.unit = c("bases", "indices", "cM"), 
        extract = TRUE, keep = c("left", "right", "random"))

Arguments

x

A bed.matrix

threshold

The maximum LD (measured by r^2) between SNPs

max.dist

The maximum distance for which the LD is computed

beg

The index of the first SNP to consider

end

The index of the last SNP to consider

which.snps

Logical vector giving which SNPs are considered. The default is to use all SNPs

dist.unit

The unit in which max.dist is expressed

extract

A logical indicating whether the function returns a bed.matrix (TRUE) or a logical vector indicating which SNPs are selected (FALSE)

keep

Which SNP of a pair is kept when the LD between the two SNPs is above threshold

Details

The SNPs to keep are selected by a greedy algorithm. The LD is computed only for SNP pairs whose distance is less than max.dist, expressed in number of bases if dist.unit = "bases", in number of SNPs if dist.unit = "indices", or in centimorgans if dist.unit = "cM".

The argument which.snps restricts the selection to a subset of SNPs.

The algorithm tries to keep the largest possible number of SNPs: it is not suitable for selecting tag-SNPs.
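
The greedy selection described above can be illustrated with a small base R sketch. This is not gaston's implementation: `greedy_thin`, its arguments, and the toy r^2 matrix are hypothetical and only for illustration. It assumes positions sorted in increasing order, a "keep the left SNP" rule, and that a pair is in conflict when its r^2 is strictly above the threshold.

```r
# Hypothetical helper (not part of gaston): greedy thinning on a
# precomputed r^2 matrix, keeping the left SNP of each conflicting pair.
greedy_thin <- function(r2, pos, threshold, max.dist) {
  n <- ncol(r2)
  keep <- rep(TRUE, n)
  for (i in seq_len(n)) {
    if (!keep[i]) next            # SNP i was already discarded
    j <- i + 1L
    # Only compare with SNPs to the right that lie closer than max.dist
    while (j <= n && pos[j] - pos[i] < max.dist) {
      if (keep[j] && r2[i, j] > threshold) keep[j] <- FALSE
      j <- j + 1L
    }
  }
  keep
}

# Toy data: four SNPs, positions in bases (assumed sorted)
pos <- c(100, 200, 300, 1000)
r2 <- matrix(0.1, 4, 4)
diag(r2) <- 1
r2[1, 2] <- r2[2, 1] <- 0.9
r2[2, 3] <- r2[3, 2] <- 0.8
r2[3, 4] <- r2[4, 3] <- 0.95

keep <- greedy_thin(r2, pos, threshold = 0.4, max.dist = 500)
keep
```

In this toy example SNP 2 is dropped because of its LD with SNP 1. SNP 4 is kept even though its r^2 with SNP 3 is 0.95, because the two SNPs are 700 bases apart and pairs at or beyond max.dist are never compared.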

Value

If extract = TRUE, a bed.matrix extracted from x with SNPs in pairwise LD below the given threshold. If extract = FALSE, a logical vector of length end - beg + 1, where TRUE indicates that the corresponding SNP is selected.

Author(s)

Hervé Perdry and Claire Dandine-Roulland

See Also

Examples

# Load data
data(TTN)
x <- as.bed.matrix(TTN.gen, TTN.fam, TTN.bim)

# Select SNPs in LD r^2 < 0.4, max.dist = 500 kb
y <- LD.thin(x, threshold = 0.4, max.dist = 500e3)
y

# Verify that no SNP pair has LD r^2 > 0.4
# (note that the matrix ld.y has ones on the diagonal,
#  so these diagonal entries are counted in the sum below)
ld.y <- LD( y, lim = c(1, ncol(y)) )
sum( ld.y > 0.4 )
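
As a complement, the following sketch shows the extract = FALSE form, which returns a logical vector instead of a bed.matrix. It assumes the gaston package is installed and uses the default beg and end, so the vector has one entry per SNP of x. The names `keep` and `y2` are chosen here for illustration.

```r
library(gaston)

# Load data
data(TTN)
x <- as.bed.matrix(TTN.gen, TTN.fam, TTN.bim)

# Same selection as above, but returned as a logical vector
keep <- LD.thin(x, threshold = 0.4, max.dist = 500e3, extract = FALSE)
table(keep)

# Use the vector to subset the SNPs (columns) of x
y2 <- x[, keep]
y2
```

Keeping the logical vector can be convenient when the same selection has to be applied to other objects indexed by SNP, such as a table of SNP annotations.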

gaston

Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models

v1.5.7
GPL-3
Authors
Hervé Perdry [cre, aut, cph], Claire Dandine-Roulland [aut, cph], Deepak Bandyopadhyay [cph] (C++ gzstream class), Lutz Kettner [cph] (C++ gzstream class)
Initial release
2020-09-18