Retrieving annotated sequences
Retrieving from a genome the sequences specified in a gff.table.
gff2fasta(gff.table, genome)
gff.table | A table of genomic features, one per row (see readGFF).
genome | A fasta object (a tibble with columns Header and Sequence).
Each row in gff.table (see readGFF) describes a genomic feature in the genome, which is a tibble with columns Header and Sequence. The information in the columns Seqid, Start, End and Strand is used to retrieve the sequences from the Sequence column of genome. Every Seqid in the gff.table must match the first token in one of the Header texts, in order to retrieve from the correct Sequence.
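The retrieval described above can be illustrated with a small base R sketch on toy data. This is not the package's implementation. The helper names `rev_comp` and `retrieve_features` are hypothetical, the toy sequences are invented for illustration, and the sketch assumes that Start and End are 1-based inclusive coordinates and that features on the negative strand are returned reverse-complemented, neither of which is stated explicitly here.

```r
# Toy genome with the two columns of a fasta object: Header and Sequence.
genome <- data.frame(
  Header   = c("contig_1 toy sequence", "contig_2 another toy sequence"),
  Sequence = c("ATGCCCGGGTTTAAA", "GGGAAACCCTTT"),
  stringsAsFactors = FALSE
)

# Toy feature table holding only the four columns used for retrieval.
gff.table <- data.frame(
  Seqid  = c("contig_1", "contig_2"),
  Start  = c(1, 4),
  End    = c(6, 9),
  Strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reverse-complement of an unambiguous DNA string.
rev_comp <- function(dna) {
  bases <- rev(strsplit(dna, "")[[1]])
  paste(chartr("ACGTacgt", "TGCAtgca", bases), collapse = "")
}

# Hypothetical helper: match each Seqid to the first token of a Header,
# then cut out Start..End from the matching Sequence.
retrieve_features <- function(gff.table, genome) {
  first.token <- sub("\\s.*$", "", genome$Header)
  idx <- match(gff.table$Seqid, first.token)
  if (anyNA(idx)) stop("Every Seqid must match the first token of a Header")
  seqs <- substr(genome$Sequence[idx], gff.table$Start, gff.table$End)
  # Assumption: negative-strand features are reverse-complemented.
  neg <- gff.table$Strand == "-"
  seqs[neg] <- vapply(seqs[neg], rev_comp, character(1), USE.NAMES = FALSE)
  seqs
}

features <- retrieve_features(gff.table, genome)
print(features)
```

A Seqid that matches no Header token stops with an error in this sketch, which mirrors the matching requirement stated above. How gff2fasta itself reacts to an unmatched Seqid is not described here.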
A fasta object with one row for each row in gff.table. The Header for each sequence is a summary of the information in the corresponding row of gff.table.
Lars Snipen and Kristian Hovde Liland.
# Using two files in this package
gff.file <- file.path(path.package("microseq"),"extdata","small.gff")
genome.file <- file.path(path.package("microseq"),"extdata","small.fna")

# Reading the genome first
genome <- readFasta(genome.file)

# Retrieving sequences
gff.table <- readGFF(gff.file)
fa.tbl <- gff2fasta(gff.table, genome)

# Alternative, using piping
readGFF(gff.file) %>% gff2fasta(genome) -> fa.tbl