PWM creating, matching, and related utilities
Position Weight Matrix (PWM) creating, matching, and related utilities for DNA data. (PWM for amino acid sequences are not supported.)
PWM(x, type = c("log2probratio", "prob"),
prior.params = c(A=0.25, C=0.25, G=0.25, T=0.25))
matchPWM(pwm, subject, min.score="80%", with.score=FALSE, ...)
countPWM(pwm, subject, min.score="80%", ...)
PWMscoreStartingAt(pwm, subject, starting.at=1)
## Utility functions for basic manipulation of the Position Weight Matrix
maxWeights(x)
minWeights(x)
maxScore(x)
minScore(x)
unitScale(x)
## S4 method for signature 'matrix'
reverseComplement(x, ...)x |
For For |
type |
The type of Position Weight Matrix, either "log2probratio" or "prob". See Details section for more information. |
prior.params |
A positive numeric vector, which represents the parameters of the Dirichlet conjugate prior, with names A, C, G, and T. See Details section for more information. |
pwm |
A Position Weight Matrix represented as a numeric matrix with row names A, C, G and T. |
subject |
Typically a DNAString object. A Views object on a DNAString subject, a MaskedDNAString object, or a single character string, are also supported. IUPAC ambiguity letters in |
min.score |
The minimum score for counting a match.
Can be given as a character string containing a percentage (e.g.
|
with.score |
|
starting.at |
An integer vector specifying the starting positions of the Position Weight Matrix relatively to the subject. |
... |
Additional arguments for methods. |
The PWM function uses a multinomial model with a Dirichlet conjugate
prior to calculate the estimated probability of base b at position i. As
mentioned in the Arguments section, prior.params supplies the
parameters for the DNA bases A, C, G, and T in the Dirichlet prior. These
values result in a position independent initial estimate of the probabilities
for the bases to be
priorProbs = prior.params/sum(prior.params) and the
posterior (data infused) estimate for the probabilities for the bases in each
of the positions to be
postProbs = (consensusMatrix(x) + prior.params)/(length(x) + sum(prior.params)).
When type = "log2probratio", the PWM = unitScale(log2(postProbs/priorProbs)).
When type = "prob", the PWM = unitScale(postProbs).
A numeric matrix representing the Position Weight Matrix for PWM.
A numeric vector containing the Position Weight Matrix-based scores
for PWMscoreStartingAt.
An XStringViews object for matchPWM.
A single integer for countPWM.
A vector containing the max weight for each position in pwm
for maxWeights.
A vector containing the min weight for each position in pwm
for minWeights.
The highest possible score for a given Position Weight Matrix for
maxScore.
The lowest possible score for a given Position Weight Matrix for
minScore.
The modified numeric matrix given by
(x - minScore(x)/ncol(x))/(maxScore(x) - minScore(x)) for
unitScale.
A PWM obtained by reverting the column order in PWM x and by
reassigning each row to its complementary nucleotide
for reverseComplement.
H. Pagès and P. Aboyoun
Wasserman, WW, Sandelin, A., (2004) Applied bioinformatics for the identification of regulatory elements, Nat Rev Genet., 5(4):276-87.
## Data setup: data(HNF4alpha) library(BSgenome.Dmelanogaster.UCSC.dm3) chr3R <- Dmelanogaster$chr3R chr3R ## Create a PWM from a PFM or directly from a rectangular ## DNAStringSet object: pfm <- consensusMatrix(HNF4alpha) pwm <- PWM(pfm) # same as 'PWM(HNF4alpha)' ## Perform some general routines on the PWM: round(pwm, 2) maxWeights(pwm) maxScore(pwm) reverseComplement(pwm) ## Score the first 5 positions: PWMscoreStartingAt(pwm, chr3R, starting.at=1:5) ## Match the plus strand: hits <- matchPWM(pwm, chr3R) nhit <- countPWM(pwm, chr3R) # same as 'length(hits)' ## Use 'with.score=TRUE' to get the scores of the hits: hits <- matchPWM(pwm, chr3R, with.score=TRUE) head(mcols(hits)$score) min(mcols(hits)$score / maxScore(pwm)) # should be >= 0.8 ## The scores can also easily be post-calculated: scores <- PWMscoreStartingAt(pwm, subject(hits), start(hits)) ## Match the minus strand: matchPWM(reverseComplement(pwm), chr3R)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.