
fill.geno

Fill holes in genotype data


Description

Replace the genotype data for a cross with a version imputed by simulation with sim.geno, by the Viterbi algorithm with argmax.geno, by filling in genotypes between markers that have matching genotypes, or by taking the genotype with maximal marginal probability at each marker.

Usage

fill.geno(cross, method=c("imp","argmax", "no_dbl_XO", "maxmarginal"),
          error.prob=0.0001,
          map.function=c("haldane","kosambi","c-f","morgan"),
          min.prob=0.95)

Arguments

cross

An object of class cross. See read.cross for details.

method

Indicates how to impute: using a single simulation replicate from sim.geno ("imp"), using the Viterbi algorithm as implemented in argmax.geno ("argmax"), by simply filling in missing genotypes between markers with matching genotypes ("no_dbl_XO"), or by choosing, at each marker, the genotype with maximal marginal probability ("maxmarginal").

error.prob

Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype).

map.function

Indicates whether to use the Haldane, Kosambi, Carter-Falconer, or Morgan map function when converting genetic distances into recombination fractions.

min.prob

For method="maxmarginal", the most probable genotype at a marker is imputed if its probability is greater than this value; otherwise the genotype is made missing.
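
As a rough illustration of what the map.function argument controls, the sketch below converts a genetic distance in cM into a recombination fraction under three of the four map functions, using their standard textbook forms. The function names (haldane, kosambi, morgan) are hypothetical helpers written for this example and are not the package's internal code; Carter-Falconer is omitted because it has no closed form in this direction.

```r
# Hypothetical helpers (not R/qtl internals): distance d in cM -> recombination fraction.
haldane <- function(d) 0.5 * (1 - exp(-d / 50))   # assumes no crossover interference
kosambi <- function(d) 0.5 * tanh(d / 50)         # allows for some interference
morgan  <- function(d) pmin(d / 100, 0.5)         # complete interference, capped at 0.5

d <- c(1, 10, 50)
data.frame(cM = d,
           haldane = haldane(d),
           kosambi = kosambi(d),
           morgan  = morgan(d))
```

For short intervals the three functions give nearly the same recombination fraction, so the choice matters mostly when markers are sparse.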

Details

This function is written so that one may perform rough genome scans by marker regression without having to drop individuals with missing genotype data. We must caution the user that little trust should be placed in the results.

With method="imp", a single random imputation is performed, using sim.geno.

With method="argmax", for each individual the most probable sequence of genotypes, given the observed data (via argmax.geno), is used.

With method="no_dbl_XO", non-recombinant intervals are filled in; recombinant intervals are left missing. For example, a sequence of genotypes like A---A---H---H---A (with A and H corresponding to genotypes AA and AB, respectively, and with - being a missing value) will be filled in as AAAAA---HHHHH---A.
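
The filling rule for "no_dbl_XO" can be sketched in base R for a single individual on one chromosome. The function fill_no_dbl_xo is a hypothetical helper written for this illustration, not the package's implementation. It also assumes that missing genotypes before the first or after the last typed marker are left missing, which the text above does not specify.

```r
# Hypothetical helper, not part of R/qtl: fill each run of missing values
# that is flanked on both sides by the same observed genotype.
fill_no_dbl_xo <- function(g) {
  obs <- which(!is.na(g))                 # positions of typed markers
  if (length(obs) < 2) return(g)
  for (k in seq_len(length(obs) - 1)) {
    left  <- obs[k]
    right <- obs[k + 1]
    # Matching flanking genotypes: treat the interval as non-recombinant.
    if (g[left] == g[right]) g[left:right] <- g[left]
  }
  g
}

# The sequence A---A---H---H---A from the text, with NA for missing.
g <- c("A", NA, NA, NA, "A", NA, NA, NA, "H", NA, NA, NA, "H", NA, NA, NA, "A")
filled <- fill_no_dbl_xo(g)
paste(ifelse(is.na(filled), "-", filled), collapse = "")
```

The intervals between A and H, and between H and A, contain a recombination event at an unknown position, so they stay missing.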

With method="maxmarginal", the conditional genotype probabilities are calculated with calc.genoprob, and then at each marker, the most probable genotype is determined. This is taken as the imputed genotype if it has probability greater than min.prob; otherwise it is made missing.
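
The thresholding step can be sketched in base R. The probability matrix pr below is made up for illustration (in practice the probabilities would come from calc.genoprob), and the variable names are assumptions of this example.

```r
# Made-up conditional genotype probabilities for one individual:
# rows are markers, columns are genotypes.
pr <- rbind(c(0.99, 0.01),
            c(0.60, 0.40),
            c(0.02, 0.98))
colnames(pr) <- c("AA", "AB")
min_prob <- 0.95

best    <- max.col(pr, ties.method = "first")    # most probable genotype per marker
best_pr <- pr[cbind(seq_len(nrow(pr)), best)]    # its probability

# Keep the call only when it is sufficiently certain; otherwise set it to missing.
imputed <- ifelse(best_pr > min_prob, colnames(pr)[best], NA)
imputed
```

The second marker is left missing because its best genotype has probability 0.60, which does not exceed min_prob.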

With method="no_dbl_XO" and method="maxmarginal", some missing genotypes likely remain. With method="maxmarginal", some observed genotypes may be made missing.
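
One way to see how much each method fills in is to count missing genotypes before and after imputation. This is a minimal sketch, assuming the qtl package is installed and using its hyper example data; n_missing is a hypothetical helper defined here, built on pull.geno.

```r
library(qtl)
data(hyper)

# Hypothetical helper: count missing genotype calls across all markers and individuals.
n_missing <- function(cross) sum(is.na(pull.geno(cross)))

before    <- n_missing(hyper)
after_xo  <- n_missing(fill.geno(hyper, method = "no_dbl_XO"))
after_vit <- n_missing(fill.geno(hyper, method = "argmax"))

c(before = before, no_dbl_XO = after_xo, argmax = after_vit)
```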

Value

The input cross object with the genotype data replaced by an imputed version. Any intermediate calculations (such as those produced by calc.genoprob, argmax.geno and sim.geno) are removed.

Author(s)

Karl W Broman, broman@wisc.edu

See Also

Examples

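library(qtl)
data(hyper)

# Impute with the Viterbi algorithm, then run a two-dimensional scan by marker regression
out.mr <- scantwo(fill.geno(hyper, method="argmax"), method="mr")
plot(out.mr)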

qtl

Tools for Analyzing QTL Experiments

v1.48-1
GPL-3
Authors
Karl W Broman <broman@wisc.edu> and Hao Wu, with ideas from Gary Churchill and Saunak Sen and contributions from Danny Arends, Robert Corty, Timothee Flutre, Ritsert Jansen, Pjotr Prins, Lars Ronnegard, Rohan Shah, Laura Shannon, Quoc Tran, Aaron Wolen, Brian Yandell, and R Core Team
Initial release
2021-03-24
