(Updated) quality assessment reports on short reads
This page summarizes an updated approach to quality assessment reports in ShortRead.
    ## Input source for short reads
    QAFastqSource(con = character(), n = 1e+06, readerBlockSize = 1e+08,
        flagNSequencesRange = NA_integer_, ...,
        html = system.file("template", "QASources.html", package="ShortRead"))
    QAData(seq = ShortReadQ(), filter = logical(length(seq)), ...)

    ## Possible QA elements
    QAFrequentSequence(useFilter = TRUE, addFilter = TRUE,
        n = NA_integer_, a = NA_integer_, flagK=.8,
        reportSequences = FALSE, ...)
    QANucleotideByCycle(useFilter = TRUE, addFilter = TRUE, ...)
    QANucleotideUse(useFilter = TRUE, addFilter = TRUE, ...)
    QAQualityByCycle(useFilter = TRUE, addFilter = TRUE, ...)
    QAQualityUse(useFilter = TRUE, addFilter = TRUE, ...)
    QAReadQuality(useFilter = TRUE, addFilter = TRUE,
        flagK = 0.2, flagA = 30L, ...)
    QASequenceUse(useFilter = TRUE, addFilter = TRUE, ...)
    QAAdapterContamination(useFilter = TRUE, addFilter = TRUE,
        Lpattern = NA_character_, Rpattern = NA_character_,
        max.Lmismatch = 0.1, max.Rmismatch = 0.2, min.trim = 9L, ...)

    ## Order QA report elements
    QACollate(src, ...)

    ## perform analysis
    qa2(object, state, ..., verbose=FALSE)

    ## Outputs from qa2
    QA(src, filtered, flagged, ...)
    QAFiltered(useFilter = TRUE, addFilter = TRUE, ...)
    QAFlagged(useFilter = TRUE, addFilter = TRUE, ...)

    ## Summarize results as html report
    ## S4 method for signature 'QA'
    report(x, ..., dest = tempfile(), type = "html")

    ## additional methods; 'flag' is not fully implemented
    flag(object, ..., verbose=FALSE)

    ## S4 method for signature 'QASummary'
    rbind(..., deparse.level = 1)
con |
|
n |
|
readerBlockSize |
integer(1) number of bytes to input, as used by
|
flagNSequencesRange |
|
html |
|
seq |
|
filter |
|
useFilter, addFilter |
|
a |
|
flagK, flagA |
|
reportSequences |
|
Lpattern, Rpattern, max.Lmismatch, max.Rmismatch, min.trim |
Parameters influencing adapter identification, see
|
src |
The source, e.g., |
object |
An instance of class derived from |
.
state |
The data on which quality assessment will be performed; this is not usually necessary for end-users. |
verbose |
|
filtered, flagged |
Primarily for internal use, instances of
|
x |
An instance of |
dest |
|
type |
|
deparse.level |
see |
... |
Additional arguments, e.g., |
Use QACollate to specify the order in which components of a QA report are to be assembled. The first argument is the data source (e.g., QAFastqSource).
Functions related to data input include:
QAFastqSource defines the location of fastq files to be included in the report. con is used to construct a FastqSampler instance, and records are processed using qa2,QAFastqSource-method.
QAData is a class for representing the data during the QA report generation pass; it is primarily for internal use.
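As a rough illustration of the sampling step mentioned above, the sketch below draws a random sample of records from one fastq file with FastqSampler directly. It does not show how QAFastqSource uses the sampler internally; the sample size of 1000 is an arbitrary choice for the illustration, and the file path is the example data directory used elsewhere on this page.

```r
library(ShortRead)

## Example fastq files shipped with ShortRead; take the first one.
dirPath <- system.file(package = "ShortRead", "extdata", "E-MTAB-1147")
fl <- dir(dirPath, "fastq.gz", full.names = TRUE)[1]

## Open a sampler that draws at most 1000 records (arbitrary size
## chosen for this sketch), pull the sample, then release the file.
sampler <- FastqSampler(fl, n = 1000)
fq <- yield(sampler)
close(sampler)

## 'fq' is a ShortReadQ object holding the sampled reads.
length(fq)
head(sread(fq), 3)
```

The n and readerBlockSize arguments of QAFastqSource share their names with FastqSampler arguments, so the sampler's help page is a reasonable place to look for how they affect input.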
Possible elements in a QA report are:
QAFrequentSequence identifies the most commonly occurring sequences. One of n or a can be non-NA, and determines the number of frequent sequences reported. n specifies the number of most-frequent sequences to filter, e.g., n=10 would filter the top 10 most commonly occurring sequences; a provides a threshold frequency (count) above which reads are filtered. The sample is flagged when a fraction flagK of the reads are filtered. reportSequences determines whether the most commonly occurring sequences, as determined by n or a, are printed in the html report.
QANucleotideByCycle reports nucleotide frequency as a function of cycle.
QAQualityByCycle reports average quality score as a function of cycle.
QAQualityUse summarizes overall nucleotide qualities.
QAReadQuality summarizes the distribution of read qualities.
QASequenceUse summarizes the cumulative distribution of reads occurring 1, 2, ... times.
QAAdapterContamination reports the occurrence of ‘adapter’ sequences on the left and / or right end of each read.
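The interplay of n, a, and flagK described for QAFrequentSequence can be illustrated with a small base R sketch on toy data. This is not the ShortRead implementation: frequent_filter is a hypothetical helper written for this illustration, and the choice of `>=` when comparing the filtered fraction against flagK is an assumption.

```r
## Toy reads standing in for sequences from a fastq file.
reads <- c("ACGT", "ACGT", "ACGT", "TTGA", "TTGA", "GGCC", "CATG")

## Hypothetical helper (not part of ShortRead): mark reads whose
## sequence is among the top 'n' most frequent, or whose sequence
## occurs more than 'a' times, and flag the sample when the filtered
## fraction reaches 'flagK'.
frequent_filter <- function(reads, n = NA_integer_, a = NA_integer_,
                            flagK = 0.8) {
    ## exactly one of 'n' or 'a' must be supplied
    stopifnot(xor(is.na(n), is.na(a)))
    counts <- sort(table(reads), decreasing = TRUE)
    frequent <- if (!is.na(n)) {
        names(counts)[seq_len(min(n, length(counts)))]
    } else {
        names(counts)[as.vector(counts) > a]
    }
    filtered <- reads %in% frequent
    list(frequent = frequent, filtered = filtered,
         fraction = mean(filtered),
         flagged = mean(filtered) >= flagK)  # '>=' is an assumption
}

by_n <- frequent_filter(reads, n = 1L)  # filter the single top sequence
by_a <- frequent_filter(reads, a = 1L)  # filter sequences seen more than once
```

With these toy reads, n = 1 filters only the three "ACGT" reads, while a = 1 filters the five reads whose sequence occurs more than once; neither reaches the default flagK of 0.8, so the toy sample is not flagged.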
Martin Morgan <mtmorgan@fhcrc.org>
QA.
    library(ShortRead)

    ## locate the example fastq files shipped with ShortRead
    dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")
    fls <- dir(dirPath, "fastq.gz", full.names=TRUE)

    ## collate the data source and the QA elements, in report order
    coll <- QACollate(QAFastqSource(fls), QAReadQuality(),
        QAAdapterContamination(), QANucleotideUse(),
        QAQualityUse(), QASequenceUse(),
        QAFrequentSequence(n=10), QANucleotideByCycle(),
        QAQualityByCycle())

    ## run the analysis serially, then write the html report
    x <- qa2(coll, BPPARAM=SerialParam(), verbose=TRUE)
    res <- report(x)
    if (interactive())
        browseURL(res)