Calculate differential abundance between conditions
Performs differential abundance calculations and statistical hypothesis tests on data frames with protein, peptide or precursor data. Different methods for statistical testing are available.
diff_abundance(
  data,
  sample,
  condition,
  grouping,
  intensity_log2,
  missingness,
  comparison,
  mean = NULL,
  sd = NULL,
  n_samples = NULL,
  ref_condition,
  filter_NA_missingness = TRUE,
  method = c("t-test", "t-test_mean_sd", "moderated_t-test", "proDA"),
  p_adj_method = "BH",
  retain_columns = NULL
)
data |
A data frame containing at least the input variables that are required for the selected method. Ideally the output of |
sample |
The column in the data frame containing the sample name. Is not required if |
condition |
The column in the data frame containing the conditions. |
grouping |
The column in the data frame containing precursor or peptide identifiers. |
intensity_log2 |
The column in the data frame containing intensity values. The intensity values need to be log2 transformed. Is not required if |
missingness |
The column in the data frame containing missingness information. Can be obtained by calling |
comparison |
The column in the data frame containing comparison information of treatment/reference condition pairs. Can be obtained by calling |
mean |
The column in the data frame containing mean values for two conditions. Is only required if |
sd |
The column in the data frame containing standard deviations for two conditions. Is only required if |
n_samples |
The column in the data frame containing the number of samples per condition for two conditions. Is only required if |
ref_condition |
The condition that is used as a reference for differential abundance calculation. |
filter_NA_missingness |
A logical, default is TRUE. |
method |
A character vector, specifies the method used for statistical hypothesis testing. Methods include Welch test ("t-test"); the other available options are "t-test_mean_sd", "moderated_t-test" and "proDA". |
p_adj_method |
A character vector, specifies the p-value correction method. Possible methods are c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"). Default method is "BH". |
retain_columns |
A vector indicating which columns of the input data frame should be retained. By default (retain_columns = NULL), no additional columns are retained. |
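The correction methods listed for p_adj_method match those accepted by base R's stats::p.adjust(). As a minimal sketch of what a "BH" (Benjamini-Hochberg) correction does to a set of p-values, the snippet below adjusts a small vector. The p-values are made up for illustration, and the snippet does not show how diff_abundance() groups p-values before correcting them.

```r
# Hypothetical raw p-values, one per protein, peptide or precursor
pval <- c(0.001, 0.01, 0.03, 0.04, 0.5)

# Benjamini-Hochberg adjustment using base R
adj_pval <- stats::p.adjust(pval, method = "BH")

# Show raw and adjusted p-values side by side
print(data.frame(pval, adj_pval))
```

Any of the other listed method names can be passed to the method argument of stats::p.adjust() in the same way.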
A data frame that contains differential abundances (diff), p-values (pval) and adjusted p-values (adj_pval) for each protein, peptide or precursor (depending on the grouping variable) and the associated treatment/reference pair.
Depending on the method, the data frame contains additional columns:
"t-test": The std_error column contains the standard error of the differential abundances. n_obs contains the number of observations for the specific protein, peptide or precursor (depending on the grouping variable) and the associated treatment/reference pair.
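To make the "t-test" columns more concrete, here is a rough sketch of a Welch test on log2 intensities for a single precursor, using base R only. The intensity values are invented. The direction of the difference (treatment minus reference) and the reading of n_obs as the total number of observations are assumptions, not something stated above.

```r
# Hypothetical log2 intensities for one precursor in two conditions
control <- c(20.1, 19.8, 20.4)
treated <- c(21.6, 21.9, 21.2)

# Welch test: t.test() with var.equal = FALSE does not assume equal variances
res <- stats::t.test(treated, control, var.equal = FALSE)

# Differential abundance, assumed here to be treatment minus reference
diff <- mean(treated) - mean(control)

# Standard error of the difference between the two group means
std_error <- sqrt(var(treated) / length(treated) +
                  var(control) / length(control))

pval <- res$p.value

# Assumed meaning of n_obs: total number of observations in the pair
n_obs <- length(control) + length(treated)

print(data.frame(diff, std_error, pval, n_obs))
```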
"t-test_mean_sd": mean_control and mean_treated columns contain the means for the reference and treatment condition, respectively. sd_control and sd_treated columns contain the standard deviations for the reference and treatment condition, respectively. n_control and n_treated columns contain the numbers of samples for the reference and treatment condition, respectively. The std_error column contains the standard error of the differential abundances. t_statistic contains the t_statistic for the t-test.
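A Welch test can be computed from means, standard deviations and sample counts alone, which is what the columns above describe. The following is a sketch of the standard textbook calculation (Welch-Satterthwaite degrees of freedom), not necessarily the exact code used by diff_abundance(). The helper name welch_from_summary and the example numbers are hypothetical.

```r
# Welch t-test from summary statistics only.
# The function name and arguments are illustrative, not a package API.
welch_from_summary <- function(mean_control, sd_control, n_control,
                               mean_treated, sd_treated, n_treated) {
  # Difference of means, assumed here to be treatment minus reference
  diff <- mean_treated - mean_control

  # Variance of each group mean
  v_control <- sd_control^2 / n_control
  v_treated <- sd_treated^2 / n_treated

  std_error <- sqrt(v_control + v_treated)
  t_statistic <- diff / std_error

  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (v_control + v_treated)^2 /
    (v_control^2 / (n_control - 1) + v_treated^2 / (n_treated - 1))

  # Two-sided p-value from the t distribution
  pval <- 2 * stats::pt(abs(t_statistic), df = df, lower.tail = FALSE)

  list(diff = diff, std_error = std_error,
       t_statistic = t_statistic, df = df, pval = pval)
}

# Hypothetical summary values for one precursor
res <- welch_from_summary(mean_control = 20.1, sd_control = 0.30, n_control = 3,
                          mean_treated = 21.6, sd_treated = 0.35, n_treated = 3)
print(res)
```

Applied to the summary statistics of raw data, this should give the same t-statistic and p-value as stats::t.test() with var.equal = FALSE.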
"moderated_t-test": CI_2.5 and CI_97.5 give the 2.5% and 97.5% borders of the confidence interval for the differential abundance. avg_abundance contains average abundances for treatment/reference pairs (mean of the two group means). t_statistic contains the t_statistic for the t-test. B contains the B-statistic, which is the log-odds that the protein, peptide or precursor (depending on grouping) has a differential abundance between the two groups. Suppose B = 1.5. The odds of differential abundance are exp(1.5) = 4.48, i.e., about four and a half to one. The probability that there is a differential abundance is 4.48/(1 + 4.48) = 0.82, i.e., the probability is about 82% that the protein, peptide or precursor is differentially abundant. n_obs contains the number of observations for the specific protein, peptide or precursor (depending on the grouping variable) and the associated treatment/reference pair.
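The B-statistic arithmetic above can be reproduced in a few lines of base R. This is only a worked version of the example with B = 1.5. The variable names are chosen for illustration.

```r
# B-statistic (log-odds of differential abundance) from the example
B <- 1.5

# Convert log-odds to odds, then odds to a probability
odds <- exp(B)
prob <- odds / (1 + odds)

# The same conversion is available as the logistic function: stats::plogis(B)
print(round(c(odds = odds, probability = prob), 2))
```

A B-statistic of 0 corresponds to odds of one to one, i.e., a probability of 0.5.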
"proDA": The std_error column contains the standard error of the differential abundances. avg_abundance contains average abundances for treatment/reference pairs (mean of the two group means). t_statistic contains the t_statistic for the t-test. n_obs contains the number of observations for the specific protein, peptide or precursor (depending on the grouping variable) and the associated treatment/reference pair.
## Not run:
diff_abundance(
  data,
  sample = r_file_name,
  condition = r_condition,
  grouping = eg_precursor_id,
  intensity_log2 = normalised_intensity_log2,
  missingness = missingness,
  comparison = comparison,
  ref_condition = "control",
  method = "t-test",
  retain_columns = c(pg_protein_accessions)
)
## End(Not run)