
gff2fasta

Retrieving annotated sequences


Description

Retrieves from a genome the sequences specified in a gff.table.

Usage

gff2fasta(gff.table, genome)

Arguments

gff.table

A gff.table (tibble) with genomic features information.

genome

A fasta object (tibble) with the genome sequence(s).

Details

Each row in gff.table (see readGFF) describes a genomic feature in genome, which is a tibble with columns Header and Sequence. The information in the columns Seqid, Start, End and Strand is used to retrieve the sequences from the Sequence column of genome. Every Seqid in gff.table must match the first token of one of the Header texts, so that each feature is retrieved from the correct Sequence.
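The matching and retrieval described above can be illustrated with a small base-R sketch. This is not the package's implementation. It assumes that Start and End are 1-based inclusive coordinates and that features on the negative strand are returned reverse-complemented. The toy data and the helper rev_comp are made up for illustration, and plain data frames stand in for the tibbles.

```r
# Toy genome in the shape described above: columns Header and Sequence.
genome <- data.frame(
  Header   = c("contig_1 toy sequence", "contig_2 another toy sequence"),
  Sequence = c("ATGCCCGGGTTTAAA", "GGGATTACACCC"),
  stringsAsFactors = FALSE
)

# Toy feature table with only the four columns used for retrieval.
gff.table <- data.frame(
  Seqid  = c("contig_1", "contig_2"),
  Start  = c(1, 4),
  End    = c(6, 10),
  Strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reverse-complement of an unambiguous DNA string.
rev_comp <- function(dna) {
  bases <- rev(strsplit(dna, "")[[1]])
  paste(chartr("ACGTacgt", "TGCAtgca", bases), collapse = "")
}

# The first token of each Header is the text before the first whitespace.
first.token <- sub("\\s.*$", "", genome$Header)

# Every Seqid must match one of these tokens; stop early if one does not.
idx <- match(gff.table$Seqid, first.token)
stopifnot(!anyNA(idx))

# Cut out Start..End from the matching Sequence (assumed 1-based, inclusive).
seqs <- substr(genome$Sequence[idx], gff.table$Start, gff.table$End)

# Assumption: negative-strand features are reverse-complemented.
neg <- gff.table$Strand == "-"
seqs[neg] <- vapply(seqs[neg], rev_comp, character(1), USE.NAMES = FALSE)

print(seqs)
```

Running the same kind of match() check on a real gff.table and genome before calling gff2fasta is one way to spot Seqid values that have no corresponding Header.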

Value

A fasta object with one row for each row in gff.table. The Header for each sequence is a summary of the information in the corresponding row of gff.table.

Author(s)

Lars Snipen and Kristian Hovde Liland.

See Also

Examples

# Using two files in this package
gff.file <- file.path(path.package("microseq"),"extdata","small.gff")
genome.file <- file.path(path.package("microseq"),"extdata","small.fna")

# Reading the genome first
genome <- readFasta(genome.file)

# Retrieving sequences
gff.table <- readGFF(gff.file)
fa.tbl <- gff2fasta(gff.table, genome)

# Alternative, using piping
readGFF(gff.file) %>% gff2fasta(genome) -> fa.tbl

microseq

Basic Biological Sequence Handling

v2.1.4
GPL-2
Authors
Lars Snipen, Kristian Hovde Liland
Initial release
2021-01-25
