Genome scan with a single-QTL model
Genome scan with a single-QTL model by Haley-Knott regression or a linear mixed model, with possible allowance for covariates.
scan1( genoprobs, pheno, kinship = NULL, addcovar = NULL, Xcovar = NULL, intcovar = NULL, weights = NULL, reml = TRUE, model = c("normal", "binary"), hsq = NULL, cores = 1, ... )
genoprobs |
Genotype probabilities as calculated by
|
pheno |
A numeric matrix of phenotypes, individuals x phenotypes. |
kinship |
Optional kinship matrix, or a list of kinship matrices (one per chromosome), in order to use the LOCO (leave one chromosome out) method. |
addcovar |
An optional numeric matrix of additive covariates. |
Xcovar |
An optional numeric matrix with additional additive covariates used for null hypothesis when scanning the X chromosome. |
intcovar |
An numeric optional matrix of interactive covariates. |
weights |
An optional numeric vector of positive weights for the
individuals. As with the other inputs, it must have |
reml |
If |
model |
Indicates whether to use a normal model (least
squares) or binary model (logistic regression) for the phenotype.
If |
hsq |
Considered only if |
cores |
Number of CPU cores to use, for parallel calculations.
(If |
... |
Additional control parameters; see Details. |
We first fit the model y = Xb + e where X is a matrix of covariates (or just an intercept) and e is multivariate normal with mean 0 and covariance matrix sigmasq*[hsq*2*K+I] where K is the kinship matrix and I is the identity matrix.
We then take hsq as fixed and then scan the genome, at each genomic position fitting the model y = Xb + e where P is a matrix of genotype probabilities for the current position and again X is a matrix of covariates e is multivariate normal with mean 0 and covariance matrix sigmasq*[hsq*2*K+I], taking hsq to be known.
For each of the inputs, the row names are used as
individual identifiers, to align individuals. The genoprobs
object should have a component "is_x_chr"
that indicates
which of the chromosomes is the X chromosome, if any.
The ...
argument can contain several additional control
parameters; suspended for simplicity (or confusion, depending on
your point of view). tol
is used as a tolerance value for linear
regression by QR decomposition (in determining whether columns are
linearly dependent on others and should be omitted); default
1e-12
. intcovar_method
indicates whether to use a high-memory
(but potentially faster) method or a low-memory (and possibly
slower) method, with values "highmem"
or "lowmem"
; default
"lowmem"
. max_batch
indicates the maximum number of phenotypes
to run together; default is unlimited. maxit
is the maximum
number of iterations for converence of the iterative algorithm
used when model=binary
. bintol
is used as a tolerance for
converence for the iterative algorithm used when model=binary
.
eta_max
is the maximum value for the "linear predictor" in the
case model="binary"
(a bit of a technicality to avoid fitted
values exactly at 0 or 1).
If kinship
is absent, Haley-Knott regression is performed.
If kinship
is provided, a linear mixed model is used, with a
polygenic effect estimated under the null hypothesis of no (major)
QTL, and then taken as fixed as known in the genome scan.
If kinship
is a single matrix, then the hsq
in the results is a vector of heritabilities (one value for each phenotype). If
kinship
is a list (one matrix per chromosome), then
hsq
is a matrix, chromosomes x phenotypes.
An object of class "scan1"
: a matrix of LOD scores, positions x phenotypes.
Also contains one or more of the following attributes:
sample_size
- Vector of sample sizes used for each
phenotype
hsq
- Included if kinship
provided: A matrix of
estimated heritabilities under the null hypothesis of no
QTL. Columns are the phenotypes. If the "loco"
method was
used with calc_kinship()
to calculate a list
of kinship matrices, one per chromosome, the rows of hsq
will be the heritabilities for the different chromosomes (well,
leaving out each one). If Xcovar
was not NULL, there will at
least be an autosome and X chromosome row.
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324.
Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, Eskin E (2008) Efficient control of population structure in model organism association mapping. Genetics 178:1709–1723.
# read data iron <- read_cross2(system.file("extdata", "iron.zip", package="qtl2")) # insert pseudomarkers into map map <- insert_pseudomarkers(iron$gmap, step=1) # calculate genotype probabilities probs <- calc_genoprob(iron, map, error_prob=0.002) # grab phenotypes and covariates; ensure that covariates have names attribute pheno <- iron$pheno covar <- match(iron$covar$sex, c("f", "m")) # make numeric names(covar) <- rownames(iron$covar) Xcovar <- get_x_covar(iron) # perform genome scan out <- scan1(probs, pheno, addcovar=covar, Xcovar=Xcovar) # leave-one-chromosome-out kinship matrices kinship <- calc_kinship(probs, "loco") # genome scan with a linear mixed model out_lmm <- scan1(probs, pheno, kinship, covar, Xcovar)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.