draw_all_admix

Simulate random allele frequencies and genotypes from the BN-PSD admixture model


Description

This function returns simulated ancestral, intermediate, and individual-specific allele frequencies and genotypes given the admixture structure, which is determined by the admixture proportions and the vector of intermediate subpopulation FST values. The function is a wrapper around draw_p_anc, draw_p_subpops, make_p_ind_admix, and draw_genotypes_admix, with additional features such as requiring polymorphic loci.

Importantly, by default fixed loci (where all individuals are homozygous for the same allele) are re-drawn from the start (starting from the ancestral allele frequencies), so no fixed loci are in the output and no biases are introduced by re-drawing genotypes conditional on any of the previous allele frequencies (ancestral, intermediate, or individual-specific).

Below, m_loci (also m) is the number of loci, n is the number of individuals, and k is the number of intermediate subpopulations.
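The four stages named above can be illustrated with a minimal base R sketch. This is not the package's implementation: it assumes the usual BN-PSD formulation (Balding-Nichols Beta draws for the intermediate subpopulations, admixture-weighted averages for individuals, Binomial genotypes), and the toy admixture matrix and variable names are chosen here for illustration only.

```r
set.seed(1)

m_loci <- 10    # number of loci
n_ind <- 5      # number of individuals
k_subpops <- 2  # number of intermediate subpopulations

# illustrative inputs (assumptions): FST per subpopulation and a toy
# n-by-k admixture proportions matrix whose rows sum to 1
inbr_subpops <- c(0.1, 0.3)
w <- seq(0, 1, length.out = n_ind)
admix_proportions <- cbind(1 - w, w)

# 1) ancestral allele frequencies (the uniform default described under `beta`)
p_anc <- runif(m_loci, min = 0.01, max = 0.5)

# 2) intermediate subpopulation allele frequencies (Balding-Nichols):
#    Beta(nu * p_anc, nu * (1 - p_anc)) with nu = 1 / FST - 1
p_subpops <- sapply(inbr_subpops, function(fst) {
  nu <- 1 / fst - 1
  rbeta(m_loci, nu * p_anc, nu * (1 - p_anc))
})

# 3) individual-specific allele frequencies: admixture-weighted averages
p_ind <- p_subpops %*% t(admix_proportions)  # m-by-n

# 4) genotypes: two independent allele draws per locus and individual
X <- matrix(
  rbinom(m_loci * n_ind, size = 2, prob = p_ind),
  nrow = m_loci, ncol = n_ind
)
```

The sketch keeps every intermediate matrix in memory, which is the simplest layout to read but not the low-memory strategy the function uses by default.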

Usage

draw_all_admix(
  admix_proportions,
  inbr_subpops,
  m_loci,
  want_genotypes = TRUE,
  want_p_ind = FALSE,
  want_p_subpops = FALSE,
  want_p_anc = TRUE,
  verbose = FALSE,
  require_polymorphic_loci = TRUE,
  beta = NA,
  p_anc = NULL
)

Arguments

admix_proportions

The n-by-k matrix of admixture proportions.

inbr_subpops

The length-k vector (or scalar) of intermediate subpopulation FST values.

m_loci

The number of loci to draw.

want_genotypes

If TRUE (default), includes the matrix of random genotypes in the return list.

want_p_ind

If TRUE (NOT default), includes the matrix of individual-specific allele frequencies in the return list. By default p_ind is never constructed in full; instead, a fast low-memory algorithm constructs it in parts, only as needed. Beware that setting want_p_ind = TRUE increases memory usage in comparison.

want_p_subpops

If TRUE (NOT default), includes the matrix of random intermediate subpopulation allele frequencies in the return list.

want_p_anc

If TRUE (default), includes the vector of random ancestral allele frequencies in the return list.

verbose

If TRUE, prints messages for every stage in the algorithm.

require_polymorphic_loci

If TRUE (default), the returned genotype matrix will not include any fixed loci. Loci that happen to be fixed are drawn again, starting from their ancestral allele frequencies, and checked iteratively until no fixed loci remain, so that the final number of polymorphic loci is exactly m_loci.

beta

Shape parameter of a symmetric Beta distribution for the ancestral allele frequencies p_anc. If NA (default), p_anc is uniform with range in [0.01, 0.5]. Otherwise, p_anc has a symmetric Beta distribution with range in [0, 1].

p_anc

If provided, it is used as the ancestral allele frequencies (instead of drawing random ones). Must either be a scalar or a length-m_loci vector. If scalar and want_p_anc = TRUE, then the returned p_anc is the scalar value repeated m_loci times (it is always a vector).
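The behaviour of beta and require_polymorphic_loci described above can be sketched in base R as follows. The helper names (draw_p_anc_sketch, draw_once_sketch, draw_polymorphic_sketch) are hypothetical and are not part of bnpsd; the sketch assumes inbr_subpops has length k and only mirrors the documented logic of re-drawing fixed loci from the ancestral stage onward.

```r
# hypothetical helper: ancestral allele frequencies for m loci
draw_p_anc_sketch <- function(m, beta = NA) {
  if (is.na(beta)) {
    runif(m, min = 0.01, max = 0.5)  # default: uniform on [0.01, 0.5]
  } else {
    rbeta(m, beta, beta)             # symmetric Beta on [0, 1]
  }
}

# hypothetical helper: one full pass of the model for m loci
draw_once_sketch <- function(admix_proportions, inbr_subpops, m, beta = NA) {
  n <- nrow(admix_proportions)
  p_anc <- draw_p_anc_sketch(m, beta)
  # Balding-Nichols draws, one column per intermediate subpopulation
  p_subpops <- matrix(
    unlist(lapply(inbr_subpops, function(fst) {
      nu <- 1 / fst - 1
      rbeta(m, nu * p_anc, nu * (1 - p_anc))
    })),
    nrow = m
  )
  p_ind <- p_subpops %*% t(admix_proportions)
  X <- matrix(rbinom(m * n, size = 2, prob = p_ind), nrow = m, ncol = n)
  list(X = X, p_anc = p_anc)
}

# hypothetical helper: re-draw fixed loci from the start until none remain
draw_polymorphic_sketch <- function(admix_proportions, inbr_subpops,
                                    m_loci, beta = NA) {
  n <- nrow(admix_proportions)
  X <- matrix(NA_integer_, nrow = m_loci, ncol = n)
  p_anc <- rep(NA_real_, m_loci)
  todo <- seq_len(m_loci)  # loci that still need to be (re)drawn
  while (length(todo) > 0) {
    new <- draw_once_sketch(admix_proportions, inbr_subpops, length(todo), beta)
    X[todo, ] <- new$X
    p_anc[todo] <- new$p_anc
    # a locus is fixed if every individual is homozygous for the same allele
    counts <- rowSums(X[todo, , drop = FALSE])
    todo <- todo[counts == 0 | counts == 2 * n]
  }
  list(X = X, p_anc = p_anc)
}
```

Because each fixed locus is discarded together with its ancestral frequency and drawn afresh, the accepted loci are not conditioned on any earlier allele frequency, which is the property the Description emphasizes.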

Value

A named list with the following items (which may be missing depending on options):

  • X: An m-by-n matrix of genotypes. Included if want_genotypes = TRUE.

  • p_anc: A length-m vector of ancestral allele frequencies. Included if want_p_anc = TRUE.

  • p_subpops: An m-by-k matrix of intermediate subpopulation allele frequencies. Included if want_p_subpops = TRUE.

  • p_ind: An m-by-n matrix of individual-specific allele frequencies. Included if want_p_ind = TRUE.

Examples

# dimensions
# number of loci
m_loci <- 10
# number of individuals
n_ind <- 5
# number of intermediate subpops
k_subpops <- 2

# define population structure
# FST values for k = 2 subpopulations
inbr_subpops <- c(0.1, 0.3)
# admixture proportions from 1D geography
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# draw all random allele freqs and genotypes
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci)

# return value is a list with these items:

# genotypes
X <- out$X

# ancestral AFs
p_anc <- out$p_anc

# # these are excluded by default, but would be included if ...
# # ... `want_p_subpops == TRUE`
# # intermediate subpopulation AFs
# p_subpops <- out$p_subpops
# 
# # ... `want_p_ind == TRUE`
# # individual-specific AFs
# p_ind <- out$p_ind
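
The optional outputs can also be requested explicitly. The following is a small usage sketch that assumes the bnpsd package is installed; it uses only the arguments listed under Usage, and the scalar p_anc value of 0.3 is an arbitrary choice for illustration.

```r
library(bnpsd)

# same toy dimensions and structure as in the example above
m_loci <- 10
n_ind <- 5
k_subpops <- 2
inbr_subpops <- c(0.1, 0.3)
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# request every optional output and fix the ancestral allele frequency
out <- draw_all_admix(
  admix_proportions,
  inbr_subpops,
  m_loci,
  want_p_subpops = TRUE,
  want_p_ind = TRUE,
  p_anc = 0.3
)

# shapes documented under Value
dim(out$X)          # m-by-n genotypes
dim(out$p_subpops)  # m-by-k intermediate subpopulation AFs
dim(out$p_ind)      # m-by-n individual-specific AFs
length(out$p_anc)   # scalar p_anc is returned as a length-m vector
```

Keeping want_p_ind = FALSE remains the better choice for large simulations, since the full p_ind matrix is as large as the genotype matrix.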

bnpsd

Simulate Genotypes from the BN-PSD Admixture Model

v1.2.3
GPL-3
Authors
Alejandro Ochoa [aut, cre] (<https://orcid.org/0000-0003-4928-3403>), John D. Storey [aut] (<https://orcid.org/0000-0001-5992-402X>)
