microseq: readFastq – R documentation

readFastq

Read and write FASTQ files

Description

Reads and writes files in the FASTQ format.

Usage

readFastq(in.file)
writeFastq(fdta, out.file)

Arguments

`in.file`	url/directory/name of (gzipped) FASTQ file to read.
`fdta`	FASTQ object to write.
`out.file`	url/directory/name of (gzipped) FASTQ file to write.

Details

These functions handle input/output of sequences in the commonly used FASTQ format, typically used for storing DNA sequences (reads) after sequencing. If a filename (in.file or out.file) has the extension .gz, the file is automatically uncompressed when read and compressed when written.
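As a hedged sketch of the .gz behaviour described above, the following round trip writes a small table to a gzipped file and reads it back. The read names, sequences, quality strings, and temporary file name are made up for illustration. It assumes that writeFastq accepts headers without a leading "@".

```r
library(microseq)
library(tibble)

# A tiny made-up FASTQ table with the three expected columns
fdta <- tibble(
  Header   = c("read1", "read2"),
  Sequence = c("ACGTACGT", "GGCATTAG"),
  Quality  = c("IIIIIIII", "FFFFFFFF")
)

# The .gz extension asks writeFastq to compress the output
gz.file <- file.path(tempdir(), "example.fastq.gz")
writeFastq(fdta, out.file = gz.file)

# The same extension tells readFastq to uncompress while reading
fdta2 <- readFastq(gz.file)
print(fdta2)
```

Writing to tempdir() keeps the example from leaving files in the working directory.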

The sequences are stored in a tibble, so they can be manipulated with the usual R tools for tables (for example dplyr and stringr). The content of the file is stored as three columns: Header, Sequence and Quality. If other columns are added, these will be ignored by writeFastq.
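The following sketch illustrates the point about extra columns: a helper column is added for sorting, the table is written out, and the file read back again has only the three standard columns. The column name Length and the output file name are assumptions chosen for this example.

```r
library(microseq)
library(dplyr)
library(stringr)

# Example file shipped with the package
fq.file <- file.path(path.package("microseq"), "extdata", "small.fastq.gz")

# Add a helper column and sort the reads by decreasing length
fdta <- readFastq(fq.file) %>%
  mutate(Length = str_length(Sequence)) %>%
  arrange(desc(Length))

# Length is not part of the FASTQ format, so writeFastq leaves it out
out.file <- file.path(tempdir(), "sorted.fastq")
writeFastq(fdta, out.file = out.file)

# Reading the file back gives the three standard columns only
fdta3 <- readFastq(out.file)
print(colnames(fdta3))
```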

Value

readFastq returns a tibble with the contents of the (gzipped) FASTQ file stored in three columns of text. The first, named Header, contains the header lines; the second, named Sequence, contains the sequences; and the third, named Quality, contains the base quality scores.

writeFastq produces a (gzipped) FASTQ file.

Note

These functions will only handle files where the sequence and the quality string of each entry each occupy one single line, i.e. not the (uncommon) multiline FASTQ format.
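To make the single-line layout concrete, here is a minimal base R sketch that parses FASTQ text in which every record occupies exactly four lines (header, sequence, "+", quality). It is not the package's implementation, and the records are made up for illustration.

```r
# Made-up FASTQ text: each record is exactly four lines
lines <- c(
  "@read1", "ACGTAC", "+", "IIIIII",
  "@read2", "GGCA",   "+", "FFFF"
)

# With the single-line layout, the fields sit at fixed offsets
# within each block of four lines
idx <- seq(1, length(lines), by = 4)
fq <- data.frame(
  Header   = sub("^@", "", lines[idx]),  # drop the leading "@"
  Sequence = lines[idx + 1],
  Quality  = lines[idx + 3],             # idx + 2 is the "+" separator line
  stringsAsFactors = FALSE
)

# In a valid record, sequence and quality strings have equal length
stopifnot(nchar(fq$Sequence) == nchar(fq$Quality))
print(fq)
```

In the multiline format a sequence may wrap over several lines, so this fixed-offset indexing would no longer line up with the records.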

Author(s)

Lars Snipen and Kristian Hovde Liland.

Examples

## Not run: 
# Load the package, plus dplyr and stringr for the pipeline below
library(microseq)
library(dplyr)
library(stringr)

# We need a FASTQ file to read; here is an example file shipped with the package:
fq.file <- file.path(path.package("microseq"), "extdata", "small.fastq.gz")

# Read the file, then write its first three entries to a new (uncompressed) file
fdta <- readFastq(fq.file)
ok <- writeFastq(fdta[1:3,], out.file = "delete_me.fq")

# Use dplyr to copy the reads longer than 200 bases to another file
readFastq(fq.file) %>% 
  mutate(Length = str_length(Sequence)) %>% 
  filter(Length > 200) %>% 
  writeFasta(out.file = "long_reads.fa") # writing to FASTA file

## End(Not run)

Package: microseq (Basic Biological Sequence Handling)
Version: 2.1.4
License: GPL-2
Authors: Lars Snipen, Kristian Hovde Liland
Initial release: 2021-01-25
