genoImpute

Impute Genotypic Data


Description

Impute missing genotypic data in advanced intercross lines (AIL).

Usage

genoImpute(gdat, gmap, step, prd = NULL, gr = 2, pos = NULL,
   method = c("Haldane", "Kosambi"), na.str = "NA", msg = FALSE)

Arguments

gdat

Genotype data. Should be a matrix or a data frame, with each row representing an observation and each column a marker locus. The column names should be marker names. Genotypes can be 1, 2 and 3, or "AA", "AB" and "BB". Optional if an object prd from genoProb is used as an argument.

gmap

A genetic map. Should be a data frame (snp, chr, dist, ...), where "snp" is the SNP (marker) name, "chr" is the chromosome on which the SNP lies, and "dist" is the genetic distance in centimorgans (cM) from the left end of the chromosome.

step

Optional. If specified, it is the maximum distance (in cM) between two adjacent loci for which the probabilities are calculated. The distance corresponds to the "cumulative" recombination rate at gr-th generation. If missing, only

prd

An object returned by genoProb, or NULL (the default). See "Details" for more information.

gr

The generation under consideration.

pos

Data frame (chr, dist, snp, ...). If given, step will be ignored.

method

Whether the "Haldane" or the "Kosambi" mapping function should be used.

na.str

String for missing values.

msg

A logical variable. If TRUE, certain information will be printed out during calculation.
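As a rough illustration of the input layout described above, the following sketch builds a toy genotype matrix and a matching genetic map. The marker names, individual IDs and distances are invented, and the recoding assumes that 1, 2 and 3 correspond to "AA", "AB" and "BB" in that order. Either coding is accepted according to the gdat description, so the recoding is shown only to make the correspondence explicit.

```r
# Toy genotype matrix: rows are individuals, columns are marker loci.
# All names and values here are invented for illustration.
g_chr <- matrix(c("AA", "AB", NA, "BB", "AB", "AA"), nrow = 2,
                dimnames = list(c("id1", "id2"), c("m1", "m2", "m3")))

# Recode to 1/2/3 (assumed order AA, AB, BB); missing genotypes stay NA.
codes <- c(AA = 1L, AB = 2L, BB = 3L)
g_num <- matrix(codes[g_chr], nrow = nrow(g_chr), dimnames = dimnames(g_chr))

# Genetic map with the columns named in the 'gmap' description.
gmap <- data.frame(snp  = c("m1", "m2", "m3"),
                   chr  = c(1, 1, 1),
                   dist = c(0, 12.5, 30),   # cM from the left end
                   stringsAsFactors = FALSE)

# Sanity check: every genotyped marker should appear in the map.
stopifnot(all(colnames(g_num) %in% gmap$snp))
```

A check like the last line can catch mismatched marker names before the data are passed on.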

Details

Each missing genotype is randomly assigned, with probabilities conditional on the genotypes of the flanking SNPs (markers).
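The idea of drawing a missing value conditional on flanking markers can be sketched as follows. This is a deliberately simplified single-gamete illustration with two alleles, not the AIL model that genoImpute uses. The helper names (haldane, kosambi, cond_prob_A) and the distances are assumptions made for the example. Only the two mapping functions themselves are standard formulas.

```r
# Standard mapping functions: map distance (cM) -> recombination rate.
haldane <- function(d_cM) 0.5 * (1 - exp(-2 * d_cM / 100))
kosambi <- function(d_cM) 0.5 * tanh(2 * d_cM / 100)

# Simplified illustration (hypothetical helper): probability that the
# middle locus carries allele "A" given the alleles at the left and right
# flanking markers, assuming equal prior allele probabilities and no
# crossover interference (Haldane).
cond_prob_A <- function(left, right, d1_cM, d2_cM) {
  r1 <- haldane(d1_cM)                     # left marker to middle locus
  r2 <- haldane(d2_cM)                     # middle locus to right marker
  pL <- if (left  == "A") 1 - r1 else r1   # P(left  | middle = A)
  pR <- if (right == "A") 1 - r2 else r2   # P(right | middle = A)
  qL <- if (left  == "A") r1 else 1 - r1   # P(left  | middle = B)
  qR <- if (right == "A") r2 else 1 - r2   # P(right | middle = B)
  pL * pR / (pL * pR + qL * qR)
}

set.seed(1)
p <- cond_prob_A("A", "A", d1_cM = 5, d2_cM = 10)
# Random assignment with the conditional probabilities as weights.
imputed <- sample(c("A", "B"), size = 1, prob = c(p, 1 - p))
```

Because the assignment is random, repeated runs can give different imputed values unless the random seed is fixed.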

An object, prd, from genoProb alone can be used for the purpose of imputation. Then, the output (especially the putative loci) will be determined by prd. Optionally, it can be used together with gdat so that missing values in gdat will be imputed if possible, depending on whether loci in the columns of gdat can be identified in the third dimension of prd; this won't change the original genotypic data. See examples.
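The matching between the columns of gdat and the loci in prd can be pictured with the sketch below. The array pr is a hypothetical stand-in (individuals x genotypes x loci). The actual structure of a genoProb object is not described here and may differ, and all names and probabilities are invented.

```r
# Hypothetical stand-in for the probability array in a genoProb result.
loci <- c("m1", "p1", "m2", "p2")   # invented: markers plus putative loci
pr <- array(1/3, dim = c(2, 3, length(loci)),
            dimnames = list(c("id1", "id2"), c("AA", "AB", "BB"), loci))

# Toy genotype data: "m1" is present in pr, "m9" is not.
gdat <- matrix(c(1, NA, 3, 2), nrow = 2,
               dimnames = list(c("id1", "id2"), c("m1", "m9")))

# Columns of gdat that can be identified in the third dimension of pr.
imputable <- colnames(gdat)[colnames(gdat) %in% dimnames(pr)[[3]]]

set.seed(1)
filled <- gdat   # work on a copy so the original data stay unchanged
for (snp in imputable) {
  for (i in which(is.na(gdat[, snp]))) {
    # Draw genotype 1, 2 or 3 with the stored probabilities as weights.
    filled[i, snp] <- sample(1:3, size = 1, prob = pr[i, , snp])
  }
}
```

In this toy case only the missing value in column "m1" is filled, and gdat itself is left as it was.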

Value

A matrix with the same number of rows as gdat. The number of columns depends on the SNP set found in both gdat and gmap and on the step length.

Note

Currently only suitable for advanced intercross lines.

See Also

genoProb

Examples

data(miscEx)

# briefly look at genotype data
sum(is.na(gdatF8))
gdatF8[1:5,1:5]

## Not run: 
# run 'genoProb'
gdtmp <- gdatF8
gdtmp <- replace(gdtmp, is.na(gdtmp), 0)
prDat <- genoProb(gdat=gdtmp, gmap=gmapF8, gr=8, method="Haldane", msg=TRUE)

# imputation based on 'genoProb' object
tmp<- genoImpute(prd=prDat)
sum(is.na(tmp))
tmp[1:5,1:5]

# imputation based on both genotype data and 'genoProb' object
tmp<- genoImpute(gdatF8, prd=prDat)
sum(is.na(tmp))
tmp[1:5,1:5]

# imputation based on genotype data
tmp<- genoImpute(gdatF8, gmap=gmapF8, gr=8, na.str=NA)
sum(is.na(tmp))
tmp[1:5, 1:5]
# set "msg=TRUE" for more information
tmp<- genoImpute(gdatF8, gmap=gmapF8, gr=8, na.str=NA, msg=TRUE)
sum(is.na(tmp))
tmp[1:5, 1:5]

## End(Not run)

QTLRel

Tools for Mapping of Quantitative Traits of Genetically Related Individuals and Calculating Identity Coefficients from Pedigrees

v1.11
GPL (>= 2)
Authors
Riyan Cheng [aut, cre]
Initial release
2022-6-17
