
readMaxQuantFile

Read proteinGroups.txt files exported from MaxQuant


Description

Quantification results from MaxQuant can be read using this function and the relevant information extracted. Input files compressed as .gz can be read as well. Besides protein abundance values (XIC), peptide counting information such as the number of unique razor-peptides or PSM values can be extracted, too. The protein abundance values may be normalized using multiple methods (median normalization is the default), and the determination of normalization values can be restricted to specific proteins (normalization to bait protein(s), or to the matrix in UPS1 spike-in experiments). In addition, a graphical display of the distribution of protein abundance values may be generated. The final output is a list containing these elements: $raw, $quant, $annot, $counts, $quantNotes, $notes, or (if separateAnnot=FALSE) a data.frame with annotation and main quantification content.

Usage

readMaxQuantFile(
  path,
  fileName = "proteinGroups.txt",
  normalizeMeth = "median",
  quantCol = "LFQ.intensity",
  contamCol = "Potential.contaminant",
  pepCountCol = c("Razor...unique.peptides.", "MS.MS.count."),
  uniqPepPat = NULL,
  refLi = NULL,
  extrColNames = c("Majority.protein.IDs", "Fasta.headers", "Number.of.proteins"),
  specPref = c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens"),
  remRev = TRUE,
  separateAnnot = TRUE,
  tit = NULL,
  wex = 1.6,
  plotGraph = TRUE,
  silent = FALSE,
  callFrom = NULL
)

Arguments

path

(character) path of file to be read

fileName

(character) name of file to be read (default 'proteinGroups.txt' as typically generated by MaxQuant in txt folder). Gz-compressed files can be read, too.

normalizeMeth

(character) normalization method (for details see normalizeThis)

quantCol

(character or integer) exact column names, or, if length=1, the content of quantCol will be used as a pattern to search among the column names for $quant using grep

contamCol

(character or integer, length=1) which column should be used for contaminants marked by MaxQuant

pepCountCol

(character) pattern to search among column-names for count data of PSM and NoOfPeptides

uniqPepPat

(character, length=1) deprecated, please use pepCountCol instead

refLi

(character or integer) custom specification of which lines of data belong to the main species; if character (e.g. 'mainSpe'), the column 'SpecType' in $annot will be searched for an exact match of the (single) term given

extrColNames

(character) column names to be read (1: column name for protein-IDs, default 'Majority.protein.IDs'; 2: column name of fasta-headers, default 'Fasta.headers'; 3: column name for number of protein IDs matching, default 'Number.of.proteins'). The prefix for LFQ quantitation (default 'LFQ.intensity') is given separately via quantCol.

specPref

(character) prefixes or patterns in identifiers that allow recognizing i) the contamination database, ii) the species of the main identifications and iii) the spike-in species

remRev

(logical) option to remove all protein-identifications based on reverse-peptides

separateAnnot

(logical) if TRUE the output will be organized as a list with $annot, $raw for initial/raw abundance values and $quant with final normalized quantitations

tit

(character) custom title for the plot

wex

(numeric) relative expansion factor of the violin in plot

plotGraph

(logical) optionally draw a violin plot (vioplot) of initial and normalized data (using normalizeMeth); alternatively the argument may contain numeric details that will be passed to layout when plotting

silent

(logical) suppress messages

callFrom

(character) allows easier tracking of the messages produced
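
Several of the arguments above (quantCol, specPref) rely on pattern matching against column names or identifiers. The following is a minimal base-R sketch of that idea, not the package's internal code: the column names, the fasta headers and the label 'conta' are made-up illustrations, and the exact matching rules used by readMaxQuantFile may differ.

```r
# Hypothetical column names as they might appear after reading proteinGroups.txt
colNa <- c("Majority.protein.IDs", "LFQ.intensity.A1", "LFQ.intensity.A2", "Intensity.A1")

# A single quantCol value is treated as a pattern and searched with grep()
quantCol <- "LFQ.intensity"
quantIdx <- grep(quantCol, colNa)   # positions of the matching columns

# specPref patterns (defaults from Usage) applied to made-up fasta headers
specPref <- c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens")
fasta <- c("sp|P02768|ALBU_HUMAN Albumin OS=Homo sapiens",
           "CON__P00761 Trypsin",
           "sp|P00698|LYSC_CHICK Lysozyme C OS=Gallus gallus",
           "sp|P00330|ADH1_YEAST Alcohol dehydrogenase 1 OS=Saccharomyces cerevisiae")

# Assumed labelling scheme: main species first, contaminants overwrite afterwards
specType <- rep(NA_character_, length(fasta))
specType[grepl(specPref["mainSpecies"], fasta)] <- "mainSpe"
specType[grepl(specPref["conta"], fasta)] <- "conta"

print(colNa[quantIdx])
print(specType)
```

Note that the patterns are regular expressions, so the dot in "LFQ.intensity" matches any single character, and matching is case-sensitive.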

Details

This function has been developed using MaxQuant versions 1.6.10.x to 1.6.17.x; the format of the resulting file 'proteinGroups.txt' is typically well conserved.
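
To illustrate the median normalization mentioned in the Description, including the option of deriving normalization values from a subset of proteins only, here is a small base-R sketch. It assumes log2-scale abundances and an additive shift per sample; the helper medianNormSketch and its argument refLines are hypothetical names for illustration, and the actual procedure in normalizeThis may differ in detail.

```r
# Toy log2 abundance matrix: 5 proteins (rows) x 3 samples (columns)
quant <- matrix(c(20, 22, 24, 26, 28,
                  21, 23, 25, 27, 29,
                  19, 21, 23, 25, 27),
                ncol = 3,
                dimnames = list(paste0("prot", 1:5), c("s1", "s2", "s3")))

# Hypothetical helper: shift each column so that the medians of the
# reference rows become equal across samples (additive on the log scale)
medianNormSketch <- function(x, refLines = seq_len(nrow(x))) {
  colMed <- apply(x[refLines, , drop = FALSE], 2, median, na.rm = TRUE)
  sweep(x, 2, colMed - mean(colMed), FUN = "-")
}

normAll <- medianNormSketch(quant)                  # all proteins as reference
normRef <- medianNormSketch(quant, refLines = 1:3)  # only rows 1 to 3 as reference

print(apply(normAll, 2, median))
```

Restricting refLines corresponds conceptually to the refLi argument: the shift is estimated from the chosen proteins only but applied to every row.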

Value

list with $raw (initial/raw abundance values), $quant (final normalized quantitations), $annot (annotation), $counts (an array with 'PSM' and 'NoOfRazorPeptides'), $quantNotes and $notes; or a data.frame with quantitation and annotation if separateAnnot=FALSE

Examples

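library(wrProteo)
# Locate the example data shipped with the package
path1 <- system.file("extdata", package="wrProteo")
# Here we'll load a short/trimmed example file (thus not the MaxQuant default name)
fiNa <- "proteinGroupsMaxQuant1.txt.gz"
# Patterns for contaminants, main species and spike-in species
specPr <- c(conta="conta|CON_|LYSC_CHICK", mainSpecies="YEAST", spike="HUMAN_UPS")
# Use the full argument name 'fileName' rather than relying on partial matching
dataMQ <- readMaxQuantFile(path1, fileName=fiNa, specPref=specPr, tit="tiny MaxQuant")
summary(dataMQ$quant)
# Inspect the distribution of NA values, with 3 groups of 3 samples each
matrixNAinspect(dataMQ$quant, gr=gl(3,3))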

wrProteo

Proteomics Data Analysis Functions

v1.4.1
GPL-3
Authors
Wolfgang Raffelsberger [aut, cre]
Initial release
