Identify the largest subset of markers that are some distance apart
Identify the largest subset of markers for which no two adjacent markers are separated by less than some specified distance; if weights are provided, find the marker subset for which the sum of the weights is maximized.
pickMarkerSubset(locations, min.distance, weights)
locations | A vector of marker locations.
min.distance | Minimum distance between adjacent markers in the chosen subset.
weights | (Optional) vector of weights for the markers. If missing, all markers are weighted equally, so the largest subset is found.
Let d[i] be the location of marker i and w[i] its weight, for i in 1, …, M. We use the dynamic programming algorithm of Broman and Weber (1999) to identify the subset of markers i[1], …, i[k] for which d[i[j+1]] - d[i[j]] >= min.distance for every pair of adjacent chosen markers and the sum of the w[i[j]] is maximized.
If there are multiple optimal subsets, we pick one at random.
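The recursion can be illustrated with a short plain-R sketch. This is not the package's implementation: `pick_subset_sketch` and its internals are hypothetical names introduced here, it assumes strictly positive weights, it returns indices rather than marker names, and it breaks ties deterministically (first optimum found) rather than at random.

```r
# Hypothetical helper for illustration only; not part of R/qtl.
# Assumes all weights are strictly positive.
pick_subset_sketch <- function(locations, min.distance,
                               weights = rep(1, length(locations))) {
  M <- length(locations)
  if (M == 0) return(integer(0))

  ord <- order(locations)   # work in map order
  d <- locations[ord]
  w <- weights[ord]

  best <- w                 # best[i]: max total weight of a valid subset ending at marker i
  prev <- rep(0L, M)        # prev[i]: preceding marker in that subset (0 = none)

  for (i in seq_len(M)) {
    # earlier markers far enough to the left of marker i
    ok <- which(d[i] - d[seq_len(i - 1)] >= min.distance)
    if (length(ok) > 0) {
      j <- ok[which.max(best[ok])]   # first of the best predecessors
      best[i] <- w[i] + best[j]
      prev[i] <- j
    }
  }

  # trace back from the best end point
  i <- which.max(best)
  chosen <- integer(0)
  while (i > 0) {
    chosen <- c(i, chosen)
    i <- prev[i]
  }
  ord[chosen]               # indices into the original 'locations' vector
}

pick_subset_sketch(c(0, 2, 5, 6, 11), 5)                    # 1 3 5
pick_subset_sketch(c(0, 2, 5, 6, 11), 5, c(1, 1, 1, 5, 1))  # 1 4 5
```

In the second call the heavily weighted marker at position 6 displaces the marker at position 5, since the two are too close to be chosen together.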
A vector of marker names.
Karl W Broman, broman@wisc.edu
Broman, K. W. and Weber, J. L. (1999) Method for constructing confidently ordered linkage maps. Genet. Epidemiol., 16, 337–343.
data(hyper)

# subset of markers on chr 4 spaced >= 5 cM
pickMarkerSubset(pull.map(hyper)[[4]], 5)

# no. missing genotypes at each chr 4 marker
n.missing <- nmissing(subset(hyper, chr=4), what="mar")

# weight by -log(prop'n missing), but don't let 0 missing go to +Inf
wts <- -log( (n.missing+1) / (nind(hyper)+1) )

# subset of markers on chr 4 spaced >= 5 cM, with weights = -log(prop'n missing)
pickMarkerSubset(pull.map(hyper)[[4]], 5, wts)