qtl2: scan1 – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

scan1

Genome scan with a single-QTL model

Description

Genome scan with a single-QTL model by Haley-Knott regression or a linear mixed model, with possible allowance for covariates.

Usage

scan1(
  genoprobs,
  pheno,
  kinship = NULL,
  addcovar = NULL,
  Xcovar = NULL,
  intcovar = NULL,
  weights = NULL,
  reml = TRUE,
  model = c("normal", "binary"),
  hsq = NULL,
  cores = 1,
  ...
)

Arguments

`genoprobs`	Genotype probabilities as calculated by `calc_genoprob()`.
`pheno`	A numeric matrix of phenotypes, individuals x phenotypes.
`kinship`	Optional kinship matrix, or a list of kinship matrices (one per chromosome), in order to use the LOCO (leave one chromosome out) method.
`addcovar`	An optional numeric matrix of additive covariates.
`Xcovar`	An optional numeric matrix with additional additive covariates used for null hypothesis when scanning the X chromosome.
`intcovar`	An numeric optional matrix of interactive covariates.
`weights`	An optional numeric vector of positive weights for the individuals. As with the other inputs, it must have `names` for individual identifiers.
`reml`	If `kinship` provided: if `reml=TRUE`, use REML; otherwise maximum likelihood.
`model`	Indicates whether to use a normal model (least squares) or binary model (logistic regression) for the phenotype. If `model="binary"`, the phenotypes must have values in [0, 1].
`hsq`	Considered only if `kinship` is provided, in which case this is taken as the assumed value for the residual heritability. It should be a vector with length corresponding to the number of columns in `pheno`, or (if `kinship` corresponds to a list of LOCO kinship matrices) a matrix with dimension `length(kinship) x ncol(pheno)`.
`cores`	Number of CPU cores to use, for parallel calculations. (If `0`, use `parallel::detectCores()`.) Alternatively, this can be links to a set of cluster sockets, as produced by `parallel::makeCluster()`.
`...`	Additional control parameters; see Details.

Details

We first fit the model y = Xb + e where X is a matrix of covariates (or just an intercept) and e is multivariate normal with mean 0 and covariance matrix sigmasq*[hsq*2*K+I] where K is the kinship matrix and I is the identity matrix.

We then take hsq as fixed and then scan the genome, at each genomic position fitting the model y = Xb + e where P is a matrix of genotype probabilities for the current position and again X is a matrix of covariates e is multivariate normal with mean 0 and covariance matrix sigmasq*[hsq*2*K+I], taking hsq to be known.

For each of the inputs, the row names are used as individual identifiers, to align individuals. The genoprobs object should have a component "is_x_chr" that indicates which of the chromosomes is the X chromosome, if any.

The ... argument can contain several additional control parameters; suspended for simplicity (or confusion, depending on your point of view). tol is used as a tolerance value for linear regression by QR decomposition (in determining whether columns are linearly dependent on others and should be omitted); default 1e-12. intcovar_method indicates whether to use a high-memory (but potentially faster) method or a low-memory (and possibly slower) method, with values "highmem" or "lowmem"; default "lowmem". max_batch indicates the maximum number of phenotypes to run together; default is unlimited. maxit is the maximum number of iterations for converence of the iterative algorithm used when model=binary. bintol is used as a tolerance for converence for the iterative algorithm used when model=binary. eta_max is the maximum value for the "linear predictor" in the case model="binary" (a bit of a technicality to avoid fitted values exactly at 0 or 1).

If kinship is absent, Haley-Knott regression is performed. If kinship is provided, a linear mixed model is used, with a polygenic effect estimated under the null hypothesis of no (major) QTL, and then taken as fixed as known in the genome scan.

If kinship is a single matrix, then the hsq in the results is a vector of heritabilities (one value for each phenotype). If kinship is a list (one matrix per chromosome), then hsq is a matrix, chromosomes x phenotypes.

Value

An object of class "scan1": a matrix of LOD scores, positions x phenotypes. Also contains one or more of the following attributes:

sample_size - Vector of sample sizes used for each phenotype
hsq - Included if kinship provided: A matrix of estimated heritabilities under the null hypothesis of no QTL. Columns are the phenotypes. If the "loco" method was used with calc_kinship() to calculate a list of kinship matrices, one per chromosome, the rows of hsq will be the heritabilities for the different chromosomes (well, leaving out each one). If Xcovar was not NULL, there will at least be an autosome and X chromosome row.

References

Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324.

Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, Eskin E (2008) Efficient control of population structure in model organism association mapping. Genetics 178:1709–1723.

Examples

# read data
iron <- read_cross2(system.file("extdata", "iron.zip", package="qtl2"))


# insert pseudomarkers into map
map <- insert_pseudomarkers(iron$gmap, step=1)

# calculate genotype probabilities
probs <- calc_genoprob(iron, map, error_prob=0.002)

# grab phenotypes and covariates; ensure that covariates have names attribute
pheno <- iron$pheno
covar <- match(iron$covar$sex, c("f", "m")) # make numeric
names(covar) <- rownames(iron$covar)
Xcovar <- get_x_covar(iron)

# perform genome scan
out <- scan1(probs, pheno, addcovar=covar, Xcovar=Xcovar)

# leave-one-chromosome-out kinship matrices
kinship <- calc_kinship(probs, "loco")

# genome scan with a linear mixed model
out_lmm <- scan1(probs, pheno, kinship, covar, Xcovar)

qtl2

Quantitative Trait Locus Mapping in Experimental Crosses

v0.24

GPL-3

Authors

Karl W Broman [aut, cre] (<https://orcid.org/0000-0002-4914-6671>), R Core Team [ctb]

Initial release

2020-12-16