Function fitBLUP
Solves the Linear Mixed Model and calculates the Best Linear Unbiased Predictor (BLUP)
fitBLUP(y, X = NULL, Z = NULL, K = NULL, U = NULL, d = NULL, indexK = NULL, h2 = NULL, BLUP = TRUE, method = c("REML","ML"), return.Hinv = FALSE, tol = 1E-5, maxIter = 1000, interval = c(1E-9,1E9), warn = TRUE)
y |
(numeric vector) Response variable |
X |
(numeric matrix) Design matrix for fixed effects. When |
Z |
(numeric matrix) Design matrix for random effects. When |
K |
(numeric matrix) Kinship relationships. This can be a name of a binary file where the matrix is stored |
U |
(numeric matrix) Eigenvectors from spectral value decomposition of G = U D U' |
d |
(numeric vector) Eigenvalues from spectral value decomposition of G = U D U' |
indexK |
(integer vector) Which columns and rows will be read when |
h2 |
(numeric) Heritability of the response variable. When it is not |
BLUP |
|
method |
(character) Either 'REML' (Restricted Maximum Likelihood) or 'ML' (Maximum Likelihood) |
return.Hinv |
|
tol |
(numeric) Maximum error between two consecutive solutions (convergence tolerance) when finding the root of the log-likelihood's first derivative |
maxIter |
(integer) Maximum number of iterations to run before convergence is reached |
interval |
(numeric vector) Range of values in which the root is searched |
warn |
|
The basic linear mixed model that relates phenotypes with genetic values is of the form
where y is a vector with the response, b is the vector of fixed effects, u is the vector of the (random) genetic values of the genotypes, e is the vector of environmental residuals (random error), and X and Z are design matrices conecting the fixed and genetic effects with replicates. Genetic values are assumed to follow a Normal distribution as u ~ N(0,σ2uK), and the error terms are assumed e ~ N(0,σ2eI).
The resulting vector of genetic values g = Z u will therefore follow g ~ N(0,σ2uG) where G = Z K Z'. In the un-replicated case, Z = I is an identity matrix, and hence g = u and G = K.
Only information from observed data (training set) is used to derive predictions of the values utrn = (ui), i = 1,2,...,ntrn, as
where H is a matrix of weights given by
where Gtrn is the sub-matrix corresponding to the training set, and λ0 = σ2e/σ2u is the variance components ratio representing a shrinkage parameter. This parameter is expressed in terms of the heritability, h2 = σ2u/(σ2u + σ2e), as λ0 = (1 - h2)/h2.
The prediction of values utst corresponding to un-observed data (testing set) can be done by using
where Gtst,trn is the sub-matrix of G corresponding to the testing set (in rows) and training set (in columns).
Solutions are found using the GEMMA (Genome-wide Efficient Mixed Model Analysis) approach (Zhou & Stephens, 2012). GEMMA implements the Brent's method to efficiently optimize the first derivative of the log-likelihood to solve for the variance components ratio.
Returns a list object that contains the elements:
b
: (vector) fixed effects solutions (including the intercept).
u
: (vector) random effects solutions.
varU
: random effect variance.
varE
: error variance.
h2
: heritability.
convergence
: (logical) whether Brent's method converged.
method
: either 'REML' or 'ML' method used.
Paulino Perez-Rodriguez, Marco Lopez-Cruz (maraloc@gmail.com) and Gustavo de los Campos
VanRaden PM (2008). Efficient methods to compute genomic predictions. Journal of Dairy Science, 91(11), 4414–4423.
Zhou X, Stephens M (2012). Genome-wide efficient mixed-model analysis for association studies. Nature Genetics, 44(7), 821-824
require(SFSI) data(wheatHTP) X = scale(X[1:300,])/sqrt(ncol(X)) # Subset and scale markers G = tcrossprod(X) # Genomic relationship matrix y = as.vector(scale(Y[1:300,"YLD"])) # Subset response variable # Training and testing sets tst = 1:ceiling(0.3*length(y)) trn = seq_along(y)[-tst] yNA <- y yNA[tst] <- NA fm1 = fitBLUP(yNA, K=G) plot(y[tst],fm1$u[tst]) # Predicted vs observed values in testing set cor(y[tst],fm1$u[tst]) # Prediction accuracy in testing set cor(y[trn],fm1$u[trn]) # Prediction accuracy in training set fm1$h2 # Heritability if(requireNamespace("float")){ # Using a 'float' type variable G2 = float::fl(G) fm2 = fitBLUP(yNA, K=G2) max(abs(fm1$u-fm2$u)) # Check for discrepances }
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.