wrProteo: readFasta2 – R documentation

readFasta2

Description

Read file of protein sequences in fasta format

Reads a fasta-formatted file (from UniProt) to extract (protein) sequences and names. If tableOut=TRUE, the output is organized as a matrix that separates meta-annotation (e.g. GeneName, OrganismName, ProteinName) into separate columns.

Usage

readFasta2(
  filename,
  delim = "|",
  databaseSign = c("sp", "tr", "generic", "gi"),
  tableOut = FALSE,
  UniprSep = c("OS=", "OX=", "GN=", "PE=", "SV="),
  cleanCols = TRUE,
  silent = FALSE,
  callFrom = NULL,
  debug = FALSE
)

Arguments

`filename`	(character) name of the fasta file to be read
`delim`	(character) delimiter used in the header line
`databaseSign`	(character) characters at the beginning, right after the '>' (typically specifying the database origin); they will be excluded from the sequence header
`tableOut`	(logical) toggle to return either a named character vector or a matrix with enhanced parsing of the fasta header. The resulting matrix will contain the columns 'database', 'uniqueIdentifier', 'entryName', 'proteinName', 'sequence' and further columns depending on argument `UniprSep`
`UniprSep`	(character) separators for further separating entry-fields if `tableOut=TRUE`, see also UniProt-FASTA-headers
`cleanCols`	(logical) remove columns with all entries NA, if `tableOut=TRUE`
`silent`	(logical) suppress messages
`callFrom`	(character) allows easier tracking of message(s) produced
`debug`	(logical) supplemental messages for debugging
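
To make the roles of `delim` and `UniprSep` more concrete, the following base-R sketch splits a single UniProt-style header line into the kinds of fields described above. It is an illustration only and not the code used inside `readFasta2`; the header string is just a sample, the variable names are chosen for this example, and the field names (OS, OX, ...) are simply the separators with the trailing '=' removed.

```r
# Illustrative sketch only (assumption: NOT wrProteo's internal implementation)
header   <- ">sp|P02768|ALBU_HUMAN Albumin OS=Homo sapiens OX=9606 GN=ALB PE=1 SV=2"
delim    <- "|"
UniprSep <- c("OS=", "OX=", "GN=", "PE=", "SV=")

# Drop the leading '>' and split at the delimiter ('|' must be treated as fixed text)
parts <- strsplit(sub("^>", "", header), delim, fixed = TRUE)[[1]]
database         <- parts[1]
uniqueIdentifier <- parts[2]

# The first word of the remainder is the entry name, the rest is annotation
entryName <- sub(" .*", "", parts[3])
annot     <- sub("^[^ ]+ ", "", parts[3])

# Position of each separator in the annotation (-1 if absent)
pos   <- vapply(UniprSep, function(s) as.integer(regexpr(s, annot, fixed = TRUE)), integer(1))
found <- sort(pos[pos > 0])

# The protein name is everything before the first separator
proteinName <- trimws(substr(annot, 1, min(found) - 1))

# Each field runs from just after its separator to just before the next one
starts <- found + nchar(names(found))
ends   <- c(found[-1] - 1, nchar(annot))
fields <- trimws(substring(annot, starts, ends))
names(fields) <- sub("=$", "", names(found))

c(database = database, uniqueIdentifier = uniqueIdentifier,
  entryName = entryName, proteinName = proteinName, fields)
```

Searching for the separators as fixed strings keeps the sketch short, but it would misfire on a protein name that itself contains text such as "OS="; a production parser needs to handle such cases.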

Value

Depending on `tableOut`, this function returns either a simple character vector of sequences with the UniProt IDs as names, or a matrix with the columns 'database', 'uniqueIdentifier', 'entryName', 'proteinName', 'sequence' and further columns depending on argument `UniprSep`.
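
As a rough picture of the simpler return shape (a character vector of sequences named by identifier), the base-R sketch below builds such a vector from a small temporary fasta file. It is a minimal sketch and not the package's implementation; the two entries, their identifiers and their sequences are made up for this example, and it assumes every header is followed by at least one sequence line.

```r
# Illustrative sketch only (assumption: NOT wrProteo's internal implementation)
# Two made-up entries; identifiers and sequences are invented for this example
fastaLines <- c(
  ">sp|ID0001|DEMO1_TEST Demo protein one OS=Test organism OX=0 GN=DEMO1 PE=1 SV=1",
  "MKTAYIAKQR",
  "QISFVKSHFS",
  ">tr|ID0002|DEMO2_TEST Demo protein two OS=Test organism OX=0 GN=DEMO2 PE=4 SV=1",
  "MSDNELQRAV")
tmpFile <- tempfile(fileext = ".fasta")
writeLines(fastaLines, tmpFile)

lines    <- readLines(tmpFile)
isHeader <- startsWith(lines, ">")
entryNo  <- cumsum(isHeader)          # index of the entry each line belongs to

# Concatenate the sequence lines of each entry into one string
seqs <- vapply(split(lines[!isHeader], entryNo[!isHeader]),
               paste, character(1), collapse = "")

# Use the second '|'-delimited header field as the name of each sequence
names(seqs) <- vapply(strsplit(lines[isHeader], "|", fixed = TRUE),
                      `[`, character(1), 2)
str(seqs)
```

With `tableOut=TRUE` the same information would instead be arranged one entry per row, with the header fields in separate columns.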

Examples

# tiny example with common contaminants
path1 <- system.file("extdata", package = "wrProteo")
fiNa <- "conta1.fasta"
# default: named character vector of sequences
fasta1 <- readFasta2(file.path(path1, fiNa))
## now let's read and further separate annotation-fields
fasta2 <- readFasta2(file.path(path1, fiNa), tableOut = TRUE)
str(fasta1)

wrProteo

Proteomics Data Analysis Functions

v1.4.1

GPL-3

Authors

Wolfgang Raffelsberger [aut, cre]

Initial release