
LD

Pairwise linkage disequilibrium between genetic markers.


Description

Compute pairwise linkage disequilibrium between genetic markers.

Usage

LD(g1, ...)
## S3 method for class 'genotype'
LD(g1,g2,...)
## S3 method for class 'data.frame'
LD(g1,...)

Arguments

g1

genotype object or data frame containing genotype objects

g2

genotype object (ignored if g1 is a data frame)

...

optional arguments (ignored)

Details

Linkage disequilibrium (LD) is the non-random association of marker alleles and can arise from marker proximity or from selection bias.

LD.genotype estimates the extent of LD for a single pair of genotypes. LD.data.frame computes LD for all pairs of genotypes contained in a data frame. Before starting, LD.data.frame checks the class and number of alleles of each variable in the data frame. If the data frame contains non-genotype objects, or genotypes with more or fewer than 2 alleles, these are omitted from the computation and a warning is generated.

Three estimators of LD are computed:

  • D: raw difference between the observed frequency of AB pairs and the frequency expected if the markers were independent:

    D = p(AB) - p(A)*p(B)

  • D': scaled D spanning the range [-1,1]

    D' = D / Dmax

    where, if D > 0:

    Dmax = min( p(A)p(b), p(a)p(B) )

    or if D < 0:

    Dmax = max( -p(A)p(B), -p(a)p(b) )

  • r: correlation coefficient between the markers

    r = D / sqrt( p(A) * p(a) * p(B) * p(b) )

where

  • p(A) is the observed probability of allele 'A' for marker 1,

  • p(a) = 1 - p(A) is the observed probability of allele 'a' for marker 1,

  • p(B) is the observed probability of allele 'B' for marker 2,

  • p(b) = 1 - p(B) is the observed probability of allele 'b' for marker 2, and

  • p(AB) is the probability of the marker allele pair 'AB'.
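
The estimator definitions can be sketched in a few lines of base R. The function ld_from_freqs and its argument names are hypothetical and are not part of the genetics package. D and D' follow the formulas as written in this section. r is computed as the ordinary Pearson correlation, D / sqrt(p(A) p(a) p(B) p(b)), so its sign follows the sign of D; that sign convention is an assumption of this sketch.

```r
# Hypothetical helper, not part of the genetics package.
# pA:  frequency of allele 'A' at marker 1
# pB:  frequency of allele 'B' at marker 2
# pAB: frequency of the 'AB' allele pair
ld_from_freqs <- function(pA, pB, pAB) {
  pa <- 1 - pA
  pb <- 1 - pB

  # Raw disequilibrium: observed minus expected AB frequency
  D <- pAB - pA * pB

  # Dmax is chosen by the sign of D, as in the definitions above
  Dmax <- if (D > 0) min(pA * pb, pa * pB) else max(-pA * pB, -pa * pb)
  Dprime <- if (D == 0) 0 else D / Dmax

  # Assumption: Pearson sign convention, r has the same sign as D
  r <- D / sqrt(pA * pa * pB * pb)

  list(D = D, Dprime = Dprime, r = r)
}

res <- ld_from_freqs(pA = 0.6, pB = 0.7, pAB = 0.5)
print(res)
```

Here D = 0.5 - 0.6 * 0.7 = 0.08 and Dmax = min(0.18, 0.28) = 0.18, so D' is 0.08 / 0.18.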

For genotype data, AB/ab cannot be distinguished from aB/Ab. Consequently, we estimate p(AB) using maximum likelihood and use this value in the computations.
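
As a rough illustration of the maximum likelihood step, the sketch below estimates p(AB) from a 3 x 3 table of genotype counts. It uses a standard two-locus likelihood in which double heterozygotes contribute both possible phases. The function name ml_pAB, the table layout, and the use of optimize() are assumptions made for this example; this page does not confirm that LD.genotype is implemented this way.

```r
# Hypothetical helper, not part of the genetics package.
# n: 3 x 3 matrix of genotype counts,
#    rows    = marker 1 genotypes (AA, Aa, aa),
#    columns = marker 2 genotypes (BB, Bb, bb).
ml_pAB <- function(n) {
  N  <- sum(n)
  pA <- (2 * sum(n[1, ]) + sum(n[2, ])) / (2 * N)
  pB <- (2 * sum(n[, 1]) + sum(n[, 2])) / (2 * N)

  # Log-likelihood of p(AB) with the allele frequencies held fixed.
  # The last term covers double heterozygotes (AB/ab or Ab/aB).
  loglik <- function(pAB) {
    pAb <- pA - pAB
    paB <- pB - pAB
    pab <- 1 - pA - pB + pAB
    (2 * n[1, 1] + n[1, 2] + n[2, 1]) * log(pAB) +
      (2 * n[1, 3] + n[1, 2] + n[2, 3]) * log(pAb) +
      (2 * n[3, 1] + n[2, 1] + n[3, 2]) * log(paB) +
      (2 * n[3, 3] + n[3, 2] + n[2, 3]) * log(pab) +
      n[2, 2] * log(pAB * pab + pAb * paB)
  }

  # p(AB) must keep all four pair frequencies non-negative
  eps   <- 1e-8
  lower <- max(0, pA + pB - 1) + eps
  upper <- min(pA, pB) - eps

  optimize(loglik, lower = lower, upper = upper,
           maximum = TRUE, tol = 1e-8)$maximum
}

# Made-up counts with no double heterozygotes (n[2, 2] = 0)
counts <- matrix(c(4, 2, 0,
                   2, 0, 0,
                   0, 0, 2), nrow = 3, byrow = TRUE)
pAB_hat <- ml_pAB(counts)
print(pAB_hat)
```

Because the made-up table contains no double heterozygotes, every allele pair can be counted directly (12 of the 20 pairs are AB), so the estimate should land close to 0.6. The sketch does not handle monomorphic markers, where the search interval collapses.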

Value

LD.genotype returns a list with the following elements:

call

the matched call

D

Linkage disequilibrium estimate

Dprime

Scaled linkage disequilibrium estimate

corr

Correlation coefficient

nobs

Number of observations

chisq

Chi-square statistic for linkage equilibrium (i.e., D=D'=corr=0)

p.value

Chi-square p-value for marker independence

LD.data.frame returns a list with the same elements, but each element is a matrix where the upper off-diagonal elements contain the estimate for the corresponding pair of markers. The other matrix elements are NA.
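
To show how such an upper-triangular matrix can be unpacked into one row per marker pair, here is a base R sketch. The matrix D below is built by hand with made-up values purely to mimic the layout described above; it is not output from LD.data.frame, and the marker names are placeholders.

```r
# Hypothetical matrix mimicking the described layout: estimates in the
# upper off-diagonal cells, NA everywhere else. Values are made up.
markers <- c("g1", "g2", "g3")
D <- matrix(NA_real_, nrow = 3, ncol = 3,
            dimnames = list(markers, markers))
D[upper.tri(D)] <- c(0.01, 0.02, 0.03)

# Row/column positions of the upper off-diagonal cells
idx <- which(upper.tri(D), arr.ind = TRUE)

# One row per marker pair
pairs <- data.frame(
  marker1 = rownames(D)[idx[, "row"]],
  marker2 = colnames(D)[idx[, "col"]],
  D       = D[idx]
)
print(pairs)
```

The same indexing can be applied to any other matrix that shares this layout.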

Author(s)

Gregory R. Warnes greg@warnes.net

See Also

Examples

library(genetics)

g1 <- genotype( c('T/A',    NA, 'T/T',    NA, 'T/A',    NA, 'T/T', 'T/A',
                  'T/T', 'T/T', 'T/A', 'A/A', 'T/T', 'T/A', 'T/A', 'T/T',
                     NA, 'T/A', 'T/A',   NA) )

g2 <- genotype( c('C/A', 'C/A', 'C/C', 'C/A', 'C/C', 'C/A', 'C/A', 'C/A',
                  'C/A', 'C/C', 'C/A', 'A/A', 'C/A', 'A/A', 'C/A', 'C/C',
                  'C/A', 'C/A', 'C/A', 'A/A') )


g3 <- genotype( c('T/A', 'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A',
                  'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A', 'T/T',
                  'T/A', 'T/A', 'T/A', 'T/T') )

# Compute LD on a single pair

LD(g1,g2)

# Compute LD table for all 3 genotypes

data <- makeGenotypes(data.frame(g1,g2,g3))
LD(data)

genetics

Population Genetics

v1.3.8.1.3
GPL
Authors
Gregory Warnes, with contributions from Gregor Gorjanc, Friedrich Leisch, and Michael Man.
Initial release
2012-11-26
