wrProteo: readMaxQuantFile – R documentation


readMaxQuantFile

Read proteinGroups.txt files exported from MaxQuant

Description

Quantification results from MaxQuant can be read using this function, which extracts the relevant information. Input files compressed as .gz can be read as well. Besides protein abundance values (XIC), peptide counting information such as the number of unique razor-peptides or PSM values can be extracted, too. The protein abundance values may be normalized using multiple methods (median normalization is the default); the determination of normalization values can be restricted to specific proteins (normalization to bait protein(s), or to the matrix in UPS1 spike-in experiments). In addition, a graphical display of the distribution of protein abundance values may be generated. The final output is a list containing these elements: $raw, $quant, $annot, $counts, $quantNotes, $notes, or (if separateAnnot=FALSE) a data.frame with annotation and main quantification content.
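The idea of median normalization, optionally restricted to a set of reference proteins, can be sketched in base R as follows. This is only an illustration under stated assumptions: `medianNormSketch` and its `refRows` argument are hypothetical names that are not part of wrProteo, the data are made up, and the actual procedure used by `normalizeThis` may differ (for example in the scale on which it operates).

```r
# Toy abundance matrix: 5 proteins (rows) x 3 samples (columns); values are made up
abund <- matrix(c( 10, 20, 40,
                   12, 24, 48,
                    8, 16, 32,
                   11, 22, 44,
                  100, 50, 25),
                ncol = 3, byrow = TRUE,
                dimnames = list(paste0("prot", 1:5), paste0("sample", 1:3)))

# Hypothetical helper (NOT a wrProteo function): scale each column so that the
# medians of the reference rows become equal across samples
medianNormSketch <- function(x, refRows = seq_len(nrow(x))) {
  colMed <- apply(x[refRows, , drop = FALSE], 2, median, na.rm = TRUE)
  fact <- mean(colMed) / colMed          # one scaling factor per sample
  sweep(x, 2, fact, "*")                 # apply the factors to ALL rows
}

normAll <- medianNormSketch(abund)                  # factors from all proteins
normRef <- medianNormSketch(abund, refRows = 1:4)   # factors from proteins 1-4 only

apply(normAll, 2, median)          # column medians are now identical
apply(normRef[1:4, ], 2, median)   # medians of the reference proteins are identical
```

In the second call the scaling factors are determined from the reference rows only but applied to every protein, which mirrors the notion of restricting the determination of normalization values to specific proteins.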

Usage

readMaxQuantFile(
  path,
  fileName = "proteinGroups.txt",
  normalizeMeth = "median",
  quantCol = "LFQ.intensity",
  contamCol = "Potential.contaminant",
  pepCountCol = c("Razor...unique.peptides.", "MS.MS.count."),
  uniqPepPat = NULL,
  refLi = NULL,
  extrColNames = c("Majority.protein.IDs", "Fasta.headers", "Number.of.proteins"),
  specPref = c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens"),
  remRev = TRUE,
  separateAnnot = TRUE,
  tit = NULL,
  wex = 1.6,
  plotGraph = TRUE,
  silent = FALSE,
  callFrom = NULL
)

Arguments

`path`	(character) path of file to be read
`fileName`	(character) name of file to be read (default 'proteinGroups.txt' as typically generated by MaxQuant in txt folder). Gz-compressed files can be read, too.
`normalizeMeth`	(character) normalization method (for details see `normalizeThis`)
`quantCol`	(character or integer) exact column names; or, if length=1, the content of `quantCol` will be used as a pattern to search among column names for $quant using `grep`
`contamCol`	(character or integer, length=1) which column should be used for contaminants marked by MaxQuant
`pepCountCol`	(character) pattern to search among column-names for count data of PSM and NoOfPeptides
`uniqPepPat`	(character, length=1) deprecated, please use `pepCountCol` instead
`refLi`	(character or integer) custom specification of which lines of data belong to the main species; if character (eg 'mainSpe'), the column 'SpecType' in $annot will be searched for an exact match of the (single) term given
`extrColNames`	(character) column names to be read (1: prefix for LFQ quantitation, default 'LFQ.intensity'; 2: column name for protein-IDs, default 'Majority.protein.IDs'; 3: column names of fasta-headers, default 'Fasta.headers', 4: column name for number of protein IDs matching, default 'Number.of.proteins')
`specPref`	(character) prefix to identifiers, used to recognize i) the contamination database, ii) the species of the main identifications and iii) the spike-in species
`remRev`	(logical) option to remove all protein-identifications based on reverse-peptides
`separateAnnot`	(logical) if `TRUE` output will be organized as list with `$annot`, `$raw` for initial/raw abundance values and `$quant` with final normalized quantitations
`tit`	(character) custom title to plot
`wex`	(numeric) relative expansion factor of the violin in plot
`plotGraph`	(logical) optionally plot a vioplot of initial and normalized data (using `normalizeMeth`); alternatively the argument may contain numeric details that will be passed to `layout` when plotting
`silent`	(logical) suppress messages
`callFrom`	(character) allows easier tracking of messages produced
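How a pattern such as `quantCol` can select columns via `grep`, and how reverse hits and contaminants marked with "+" can be recognized, may be illustrated with a small base R sketch. It is not the package's implementation: the toy table, the temporary file and all object names are assumptions made for illustration, and only a few columns of a real 'proteinGroups.txt' are mimicked.

```r
# Build a tiny tab-delimited, gz-compressed table mimicking a few columns
toy <- data.frame(
  "Majority protein IDs"  = c("P1", "CON__P2", "REV__P3", "P4"),
  "Potential contaminant" = c("", "+", "", ""),
  "Reverse"               = c("", "", "+", ""),
  "LFQ intensity s1"      = c(100, 200, 300, 400),
  "LFQ intensity s2"      = c(110, 190, 310, 390),
  check.names = FALSE)
tmpFi <- tempfile(fileext = ".txt.gz")
con <- gzfile(tmpFi, "w")
write.table(toy, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# read.delim() reads gz-compressed files directly; with the default
# check.names=TRUE spaces in the header are converted to dots
dat <- read.delim(tmpFi, stringsAsFactors = FALSE)

# A single pattern selects all matching quantitation columns
quantCols <- grep("LFQ.intensity", colnames(dat))

# Entries flagged with "+" are treated as contaminants or reverse hits
isConta <- dat$Potential.contaminant %in% "+"
isRev   <- dat$Reverse %in% "+"

# Drop reverse hits (compare argument remRev), keep the contaminant flag
quant <- as.matrix(dat[!isRev, quantCols])
rownames(quant) <- dat$Majority.protein.IDs[!isRev]
quant
unlink(tmpFi)
```

Here the contaminant flag is only kept as a logical vector rather than used for filtering; whether and how flagged entries are treated downstream is a design choice of the calling code.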

Details

This function has been developed using MaxQuant versions 1.6.10.x to 1.6.17.x; the format of the resulting file 'proteinGroups.txt' is typically well conserved.

Value

list with $raw (initial/raw abundance values), $quant with final normalized quantitations, $annot (columns ), $counts an array with 'PSM' and 'NoOfRazorPeptides', $quantNotes and $notes; or a data.frame with quantitation and annotation if separateAnnot=FALSE

Examples

path1 <- system.file("extdata", package="wrProteo")
# Here we'll load a short/trimmed example file (thus not MaxQuant default name) 
fiNa <- "proteinGroupsMaxQuant1.txt.gz"
specPr <- c(conta="conta|CON_|LYSC_CHICK", mainSpecies="YEAST",spike="HUMAN_UPS")
dataMQ <- readMaxQuantFile(path1, fileName=fiNa, specPref=specPr, tit="tiny MaxQuant")
summary(dataMQ$quant)
matrixNAinspect(dataMQ$quant, gr=gl(3,3))

wrProteo

Proteomics Data Analysis Functions

v1.4.1

GPL-3

Authors

Wolfgang Raffelsberger [aut, cre]

Initial release