sequoia: MkGenoErrors – R documentation

MkGenoErrors

Simulate Genotyping Errors

Description

Generate errors and missing values in a (simulated) genotype matrix.

Usage

MkGenoErrors(
  SGeno,
  CallRate = 0.99,
  SnpError = 5e-04,
  ErrorFM = function(E) {
    matrix(c(1 - E - (E/2)^2, E,     (E/2)^2,
             E/2,             1 - E, E/2,
             (E/2)^2,         E,     1 - E - (E/2)^2),
           3, 3, byrow = TRUE)
  },
  Error.shape = 0.5,
  CallRate.shape = 1
)
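
The default `ErrorFM` can be inspected on its own in base R. The sketch below copies the default function from the usage above and evaluates it at an arbitrary, illustrative error rate of 0.05. The row and column labels are added here for readability only and are not part of what the function returns.

```r
# Default ErrorFM, copied from the Usage section
ErrorFM <- function(E) {
  matrix(c(1 - E - (E/2)^2, E,     (E/2)^2,
           E/2,             1 - E, E/2,
           (E/2)^2,         E,     1 - E - (E/2)^2),
         3, 3, byrow = TRUE)
}

E <- 0.05                 # illustrative error rate (an assumption)
ErrM <- ErrorFM(E)

# Labels added for readability: rows = actual, columns = observed genotype
dimnames(ErrM) <- list(actual = as.character(0:2),
                       observed = as.character(0:2))
print(round(ErrM, 6))

# Each row is a probability distribution over the observed genotypes
stopifnot(isTRUE(all.equal(unname(rowSums(ErrM)), rep(1, 3))))
```

With E = 0.05, a homozygote is observed as a heterozygote with probability 0.05 and as the opposite homozygote with probability 0.000625, while a heterozygote is observed as either homozygote with probability 0.025 each. A custom `ErrorFM` would presumably need to keep every row summing to 1 in the same way.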

Arguments

`SGeno`	matrix with genotype data in Sequoia's format: 1 row per individual, 1 column per SNP, and genotypes coded as 0/1/2.
`CallRate`	either a single number for the mean call rate (genotyping success), OR a vector with the call rate at each SNP, OR a named vector with the call rate for each individual. In the third case, ParMis (an argument of SimGeno) is ignored, and individuals in the pedigree (as id or parent) not included in this vector are presumed non-genotyped.
`SnpError`	either a single number for the mean per-locus genotyping error rate across SNPs, in which case a beta-distribution is used to simulate the per-SNP error rates (see Error.shape), OR a vector with the genotyping error rate for each SNP.
`ErrorFM`	function taking the error rate (scalar) as argument and returning a 4x4 or 3x3 matrix with probabilities that actual genotype i (rows) is observed as genotype j (columns).
`Error.shape`	first shape parameter (alpha) of beta-distribution of per-SNP error rates. A higher value results in a flatter distribution.
`CallRate.shape`	as Error.shape, for per-SNP call rates.
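
To get a feel for how a mean rate and a first shape parameter could translate into per-SNP rates, here is a minimal sketch in base R. It assumes the second shape parameter is chosen so that the mean of the beta distribution equals the target mean. That parameterisation is an assumption made for illustration and may differ from what `MkGenoErrors` does internally. `draw_rates` is a hypothetical helper, not part of sequoia.

```r
# Hypothetical helper: draw per-SNP rates from a beta distribution with a
# given mean and first shape parameter (alpha).
# Assumption: beta = alpha * (1/mean - 1), so that alpha / (alpha + beta) = mean.
draw_rates <- function(n, mean_rate, alpha) {
  stopifnot(mean_rate > 0, mean_rate < 1, alpha > 0)
  rbeta(n, shape1 = alpha, shape2 = alpha * (1 / mean_rate - 1))
}

set.seed(1)
err <- draw_rates(n = 1e5, mean_rate = 0.05, alpha = 0.5)
mean(err)    # sample mean, close to the target of 0.05
range(err)   # all draws lie between 0 and 1
```

Calling `draw_rates` with the same `mean_rate` but different `alpha` values keeps the average rate fixed while changing how the individual rates spread around it.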

Value

The input genotype matrix, with some genotypes replaced, and some set to missing (-9).
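
Because missing genotypes are coded as -9 rather than `NA`, downstream summaries need to handle that code explicitly. A small sketch on a made-up toy matrix (not actual output of `MkGenoErrors`):

```r
# Toy genotype matrix: 4 individuals x 3 SNPs, -9 = missing (made-up data)
G <- matrix(c( 0,  1, -9,
               2, -9,  1,
               1,  1,  0,
              -9,  2,  2),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("ind", 1:4), paste0("SNP", 1:3)))

call_rate_snp <- colMeans(G != -9)   # proportion of scored genotypes per SNP
call_rate_ind <- rowMeans(G != -9)   # proportion of scored genotypes per individual

# Recode to NA for functions that expect R's own missing-value marker
G_na <- G
G_na[G_na == -9] <- NA
```

Keeping the -9 coding in the matrix that is passed on to other sequoia functions, and recoding only a copy, avoids mixing the two conventions.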

Examples

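library(sequoia)

# Simulate error-free genotypes with a complete call rate
data(Ped_HSg5)
GenoM <- SimGeno(Ped = Ped_HSg5, nSnp = 100, ParMis = 0.2,
                 SnpError = 0, CallRate = 1)
GenoM.actual <- GenoM  # keep the error-free copy for comparison
LowQ <- sample.int(nrow(GenoM), 42)  # low-quality samples
# High error rate for the low-quality samples, low error rate for the rest
GenoM[LowQ, ] <- MkGenoErrors(GenoM[LowQ, ], SnpError = 0.05)
GenoM[-LowQ, ] <- MkGenoErrors(GenoM[-LowQ, ], SnpError = 0.001)
# Per individual, count genotypes that differ from the truth (missing excluded)
ErrorCount <- sapply(seq_len(nrow(GenoM)), function(i) {
  sum(GenoM.actual[i, ] != GenoM[i, ] & GenoM[i, ] != -9) } )
mean(ErrorCount[LowQ])
mean(ErrorCount[-LowQ])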

sequoia

Pedigree Inference from SNPs

v2.3.3

GPL-2

Authors

Jisca Huisman [aut, cre]

Initial release

2021-04-30
