Age Priors
Estimate probability ratios P(R|A) / P(R) for age differences A and five categories of parent-offspring and sibling relationships R.
MakeAgePrior(
  Pedigree = NULL,
  LifeHistData = NULL,
  MaxAgeParent = NULL,
  Discrete = NULL,
  Flatten = NULL,
  lambdaNW = -log(0.5)/100,
  Smooth = TRUE,
  Plot = TRUE,
  Return = "LR",
  quiet = FALSE
)
Pedigree |
dataframe with id - dam - sire in columns 1-3, and optional column with birth years. Other columns are ignored. |
LifeHistData |
dataframe with 3 or 5 columns: id - sex (not used) - birth year (- BY.min - BY.max), with unknown birth years coded as negative numbers or NA. Column names are ignored, so the column order is important. "Birth year" may be in any arbitrary discrete time unit relevant to the species (day, month, decade), as long as parents are never born in the same time unit as their offspring. It may include individuals not in the pedigree, and not all individuals in the pedigree need to be in LifeHistData. |
MaxAgeParent |
maximum age of a parent, a single number (max across dams and sires) or a vector of length two (dams, sires). If NULL, it will be estimated from the pedigree. See details below. |
Discrete |
discrete generations? By default (NULL), discrete
generations are assumed if all parent-offspring pairs have an age
difference of 1, and all siblings an age difference of 0, and there are at
least 20 pairs of each category (mother, father, maternal sibling, paternal
sibling). Otherwise, overlapping generations are presumed. When
|
Flatten |
logical. To deal with small sample sizes for some or all
relationships, calculate a weighted average of the observed age
difference distribution among relatives and a flat (0/1) distribution. When
|
lambdaNW |
controls the weighting factors when Flatten=TRUE. |
Smooth |
smooth the tails of and any dips in the distribution? Sets dips
(<10% of average of neighbouring ages) to the average of the neighbouring
ages, sets the age after the end (oldest observed age) to LR(end)/2, and
assigns a small value (0.001) to the ages before the front (youngest
observed age) and after the new end. Peaks are not smoothed out, as these
are less likely to cause problems than dips, and are more likely to be
genuine characteristics of the species. Is set to |
Plot |
plot a heatmap of the results? |
Return |
return only a matrix with the likelihood ratio P(A|R) /
P(A) ('LR', the default), or a list that also includes intermediate statistics ('all'). |
quiet |
suppress messages. |
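As a rough illustration of how lambdaNW and Flatten might interact, the sketch below assumes the weight for each relationship takes the form W = 1 - exp(-lambdaNW * N), where N is the number of pairs with known age difference. That form is an assumption, chosen because it is consistent with the default lambdaNW = -log(0.5)/100 giving a weight of 0.5 at N = 100. The pair counts and the helper flatten_one are hypothetical.

```r
lambdaNW <- -log(0.5) / 100          # default value from the usage above

# Hypothetical numbers of pairs with known age difference per relationship
N <- c(M = 10, P = 100, FS = 500)

# Assumed weight form: close to 0 for few pairs, approaches 1 for many pairs
W <- 1 - exp(-lambdaNW * N)
round(W, 3)

# Weighted average of an observed and a flat ratio at one age difference
# (hypothetical helper, not part of the package)
flatten_one <- function(observed, flat, w) w * observed + (1 - w) * flat
flatten_one(observed = 3.2, flat = 1, w = W[["P"]])
```

With few pairs the result stays close to the flat distribution; with many pairs it follows the observed distribution.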
α_{A,R} is the ratio between the observed counts of pairs with age difference A and relationship R (N_{A,R}), and the expected counts if age and relationship were independent (N_{.,.}*p_A*p_R).
During pedigree reconstruction, α_{A,R} are multiplied by the genetic-only P(R|G) to obtain a probability that the pair are relatives of type R conditional on both their age difference and their genotypes.
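The ratio α_{A,R} can be sketched in base R from a table of counts. The counts below are invented for illustration, and only two relationship columns are shown; the layout (age differences in rows, relationships in columns, plus a column 'X' for all pairs) mirrors the description of tblA.R in the returned list.

```r
# Toy counts per age difference (rows) and relationship (columns).
# 'X' holds the age differences across all pairs; all numbers are invented.
tblA.R <- matrix(c( 0,  0, 400,
                   30, 10, 300,
                   50, 40, 200,
                   20, 50, 100),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(A = 0:3, R = c("M", "P", "X")))

# P(A|R): proportion of pairs with each age difference, per column
PA.R <- sweep(tblA.R, 2, colSums(tblA.R), "/")

# alpha_{A,R} = P(A|R) / P(A), with P(A) taken from the 'X' column
alpha <- PA.R[, c("M", "P")] / PA.R[, "X"]
round(alpha, 2)
```

Values above 1 mark age differences that are over-represented among pairs with that relationship, values below 1 mark under-represented ones, and 0 marks age differences that were never observed for it.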
The age-difference prior is used for pairs of genotyped individuals, as well as for dummy individuals. This assumes that the propensity for a pair with a given age difference to both be sampled does not depend on their relationship, so that the ratio P(A|R) / P(A) does not differ between sampled and unsampled pairs.
For further details, see the vignette.
A matrix with the probability ratio of the age difference between two individuals conditional on them being a certain type of relative (P(A|R)) versus being a random draw from the sample (P(A)). Assuming conditional independence, this equals the probability ratio of being a certain type of relative conditional on the age difference, versus being a random draw.
The matrix has one row per age difference (0 - nAgeClasses) and five columns, one for each relationship type, with abbreviations:
M |
Mothers |
P |
Fathers |
FS |
Full siblings |
MS |
Maternal half-siblings |
PS |
Paternal half-siblings |
When Return='all', a list is returned with the following elements:
BirthYearRange |
vector length 2 |
MaxAgeParent |
vector length 2, see details |
tblA.R |
matrix with the counts per age difference (rows) / relationship (columns) combination, plus a column 'X' with age differences across all pairs of individuals |
PA.R |
Proportions, i.e. |
LR.RU.A.raw |
Proportions |
Weights |
vector length 4, the weights used to flatten the distributions |
LR.RU.A |
the age prior, flattened and/or smoothed |
Specs.AP |
the names of the input |
The small sample correction with Smooth and/or Flatten prevents errors in one dataset, but may introduce errors in another; a single solution that fits the wide variety of life histories and datasets is impossible. Please do inspect the matrix, e.g. with PlotAgePrior, and adjust the input parameters and/or the output matrix as necessary.
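To make the effect of Smooth easier to inspect by hand, here is a minimal base-R sketch of the rules described for that argument, applied to one column of ratios. It is one reading of that description, not the package's implementation; the function name smooth_sketch and the input vector are hypothetical.

```r
# Sketch of the smoothing rules for one relationship (hypothetical helper).
# LR: ratios per age difference, in increasing order of age difference.
smooth_sketch <- function(LR) {
  obs <- which(LR > 0)
  if (length(obs) == 0) return(LR)     # nothing observed, nothing to smooth
  front <- min(obs)                    # youngest observed age difference
  end   <- max(obs)                    # oldest observed age difference
  out   <- LR

  # Dips: below 10% of the average of the two neighbouring ages
  if (end - front >= 2) {
    for (i in (front + 1):(end - 1)) {
      nb <- mean(LR[c(i - 1, i + 1)])
      if (LR[i] < 0.1 * nb) out[i] <- nb
    }
  }

  # Tail: the age after the end gets LR(end) / 2
  if (end < length(LR)) out[end + 1] <- out[end] / 2
  new_end <- min(end + 1, length(LR))

  # Small value before the front and after the new end
  if (front > 1) out[seq_len(front - 1)] <- 0.001
  if (new_end < length(LR)) out[(new_end + 1):length(LR)] <- 0.001
  out
}

smooth_sketch(c(0, 0, 1.5, 0.05, 2.0, 1.0, 0, 0))
```

In this toy input the dip at the fourth position is raised to the mean of its neighbours, while the peak at the fifth position is left untouched.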
When all individuals in LifeHistData have the same birth year, it is assumed that Discrete=TRUE and MaxAgeParent=1. Consequently, it is assumed there are no avuncular pairs present in the sample; cousins are considered as an alternative. To enforce overlapping generations, and thereby the consideration of full- and half-avuncular relationships, set MaxAgeParent to some value greater than 1.
When no birth year information is given at all, a single cohort is assumed, and the same rules apply.
"Birth year" may be in any arbitrary time unit relevant to the species (day, month, decade), as long as parents are always born before their putative offspring, and never in the same time unit (e.g. parent's BirthYear= 1 (or 2001) and offspring BirthYear=5 (or 2005)). Negative numbers and NA's are interpreted as unknown, and fractional numbers are not allowed.
The maximum parental age for each sex equals the maximum of:
- the maximum age of parents in Pedigree,
- the input parameter MaxAgeParent,
- the maximum range of birth years in LifeHistData (including BY.min and BY.max). Only used if both of the previous are NA, or if there are fewer than 20 parents of either sex assigned.
- 1, if Discrete=TRUE or the previous three are all NA.
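The rule above can be read as the following decision sketch for a single sex. This is only one interpretation of the list, written as a hypothetical helper; the argument names are invented and the function is not part of the package.

```r
# One reading of the maximum-parental-age rule (hypothetical helper).
# ped_max:   maximum age of assigned parents in the pedigree, or NA
# input_max: user-supplied MaxAgeParent, or NA
# by_range:  maximum range of birth years in the life history data, or NA
# n_parents: number of assigned parents of this sex
max_age_parent_sketch <- function(ped_max, input_max, by_range,
                                  n_parents, discrete = FALSE) {
  if (isTRUE(discrete)) return(1)
  candidates <- c(ped_max, input_max)
  # The birth-year range only counts as a fallback or with few parents
  if (all(is.na(candidates)) || n_parents < 20) {
    candidates <- c(candidates, by_range)
  }
  if (all(is.na(candidates))) return(1)
  max(candidates, na.rm = TRUE)
}

# Few assigned parents: the birth-year range is taken into account
max_age_parent_sketch(ped_max = 4, input_max = NA, by_range = 9, n_parents = 12)
# Many assigned parents: only the pedigree and the input parameter count
max_age_parent_sketch(ped_max = 4, input_max = 6, by_range = 9, n_parents = 50)
# No information at all: falls back to 1
max_age_parent_sketch(NA, NA, NA, n_parents = 0)
```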
If the age distribution of assigned parents does not capture the maximum possible age of parents, it is advised to specify MaxAgeParent for one or both sexes. Not doing so may hinder subsequent assignment of both dummy parents and grandparents.
Grandparents & avuncular
The age priors for grand-parental and avuncular pairs are calculated from these by sequoia, and included in its output as 'AgePriorExtra'.
See also sequoia and its argument args.AP, and PlotAgePrior for visualisation. The age vignette gives further details, mathematical justification, and some examples.
# without pedigree or lifehistdata:
MakeAgePrior()
MakeAgePrior(MaxAgeParent = c(2,3))
MakeAgePrior(Discrete=TRUE)

# single cohort:
MakeAgePrior(LifeHistData = data.frame(ID = letters[1:5], Sex=3,
                                       BirthYear=1984))

# overlapping generations:
data(Ped_griffin, SeqOUT_griffin, package="sequoia")

# without pedigree: MaxAgeParent = max age difference between any pair +1
MakeAgePrior(LifeHistData = SeqOUT_griffin$LifeHist)

# with pedigree:
MakeAgePrior(Pedigree=Ped_griffin,
             LifeHistData=SeqOUT_griffin$LifeHist,
             Smooth=FALSE, Flatten=FALSE)

# with small-sample correction:
MakeAgePrior(Pedigree=Ped_griffin,
             LifeHistData=SeqOUT_griffin$LifeHist,
             Smooth=TRUE, Flatten=TRUE)