microseq: readFasta – R documentation

microseq

readFasta

Read and write FASTA files

Description

Reads and writes biological sequences (DNA, RNA, protein) in the FASTA format.

Usage

readFasta(in.file)
writeFasta(fdta, out.file, width = 0)

Arguments

`in.file`	URL or path (directory and name) of the FASTA file to read; the file may be gzipped.
`fdta`	A `tibble` with sequence data, see ‘Details’ below.
`out.file`	Name of (gzipped) FASTA file to create.
`width`	Number of characters per line, or 0 for no linebreaks.

Details

These functions handle input and output of sequences in the commonly used FASTA format. Every sequence is presumed to have exactly one header line starting with a ‘>’. If a filename (in.file or out.file) has the extension .gz, the file is automatically uncompressed on reading or compressed on writing.
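As a minimal sketch of the format and the .gz handling described above, the following writes a tiny FASTA file by hand, reads it, and round-trips it through a gzipped copy. The file names and the two toy sequences are made up for illustration, and reading gzipped files may require additional packages depending on the installed microseq version.

```r
library(microseq)

# A tiny two-record FASTA file written by hand (made-up sequences).
# Each record has one header line starting with '>' followed by the sequence.
fa.file <- file.path(tempdir(), "toy.fasta")
writeLines(c(">seq1 first toy record", "ATGGCATGA",
             ">seq2 second toy record", "ATGTTTTAA"), fa.file)

# Read it into a tibble with the columns Header and Sequence
fdta <- readFasta(fa.file)

# A .gz extension on out.file requests compression on writing,
# and the same extension on in.file requests decompression on reading.
gz.file <- file.path(tempdir(), "toy.fasta.gz")
writeFasta(fdta, out.file = gz.file)
fdta.gz <- readFasta(gz.file)

# The sequences should survive the compressed round trip unchanged
identical(fdta$Sequence, fdta.gz$Sequence)
```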

The sequences are stored in a tibble, opening up all the possibilities in R for fast and easy manipulations. The content of the file is stored as two columns, Header and Sequence. If other columns are added, these will be ignored by writeFasta.
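The two-column layout can be illustrated with a small sketch that builds such a tibble directly, adds an extra column, and writes it out. The toy data, the `Length` column, and the output file name are assumptions made for this example only.

```r
library(microseq)
library(tibble)

# Hypothetical toy data: any tibble with Header and Sequence columns will do
fdta <- tibble(Header   = c("geneA", "geneB"),
               Sequence = c("ATGGCATGA", "ATGTTTCCCTAA"))

# Add a helper column; as stated above, writeFasta ignores extra columns
fdta$Length <- nchar(fdta$Sequence)

out.file <- file.path(tempdir(), "toy_out.fasta")
writeFasta(fdta, out.file = out.file)

# Reading the file back gives only the two standard columns
back <- readFasta(out.file)
colnames(back)
```

Keeping derived values such as lengths in extra columns is convenient for filtering and sorting, since they do not need to be dropped before writing.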

The default width = 0 in writeFasta results in no linebreaks in the sequences (one sequence per line).
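The effect of width can be checked with a short sketch that writes the same record twice and counts the lines in each file. The toy sequence and file names are assumptions for illustration.

```r
library(microseq)
library(tibble)

# One made-up record with a 20-base sequence
fdta <- tibble(Header = "toy", Sequence = strrep("ACGT", 5))

f0 <- file.path(tempdir(), "w0.fasta")
f8 <- file.path(tempdir(), "w8.fasta")
writeFasta(fdta, out.file = f0)             # default width = 0, no linebreaks
writeFasta(fdta, out.file = f8, width = 8)  # at most 8 characters per line

length(readLines(f0))  # header line plus a single sequence line
length(readLines(f8))  # header line plus several wrapped sequence lines

# Wrapping changes only the file layout; the sequence read back is the same
identical(readFasta(f8)$Sequence, fdta$Sequence)
```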

Value

readFasta returns a tibble with the contents of the (gzipped) FASTA file stored in two columns of text. The first, named Header, contains the header lines and the second, named Sequence, contains the sequences.

writeFasta produces a (gzipped) FASTA file.

Author(s)

Lars Snipen and Kristian Hovde Liland.

Examples

## Not run: 
library(microseq)
library(dplyr)    # filter() and %>%
library(stringr)  # str_detect()

# We need a FASTA file to read; the package ships one example file:
fa.file <- file.path(path.package("microseq"), "extdata", "small.ffn")

# Read the file, then write records 4 and 5 to a new file
fdta <- readFasta(fa.file)
ok <- writeFasta(fdta[4:5,], out.file = "delete_me.fasta")

# Use dplyr to copy the sequences ending in TGA to another file
readFasta(fa.file) %>% 
  filter(str_detect(Sequence, "TGA$")) %>% 
  writeFasta(out.file = "TGAstop.fasta", width = 80) -> ok

## End(Not run)

microseq

Basic Biological Sequence Handling

v2.1.4

GPL-2

Authors

Lars Snipen, Kristian Hovde Liland

Initial release

2021-01-25
