sctransform: diff_mean_test – R documentation


sctransform

diff_mean_test

Non-parametric differential expression test for sparse non-negative data

Description

Non-parametric differential expression test for sparse non-negative data

Usage

diff_mean_test(
  y,
  labels,
  R = 99,
  log2FC_th = log2(1.2),
  mean_th = 0.05,
  cells_th = 5,
  only_pos = FALSE,
  only_top_n = NULL,
  mean_type = "geometric",
  verbosity = 1
)

Arguments

`y`	A matrix of counts; must be (or inherit from) class dgCMatrix; genes are rows, cells are columns
`labels`	A factor giving the group labels; must have exactly two levels
`R`	The number of random permutations used to derive the p-values; default is 99
`log2FC_th`	Threshold to remove genes from testing; absolute log2FC must be at least this large for a gene to be tested; default is `log2(1.2)`
`mean_th`	Threshold to remove genes from testing; gene mean must be at least this large for a gene to be tested; default is 0.05
`cells_th`	Threshold to remove genes from testing; gene must be detected (non-zero count) in at least this many cells in the group with higher mean; default is 5
`only_pos`	Test only genes with positive fold change (mean in group 1 > mean in group 2); default is FALSE
`only_top_n`	Test only this number of genes from each end of the log2FC spectrum after all of the above filters have been applied; useful to get only the top markers; only used if set to a numeric value; default is NULL
`mean_type`	Which type of mean to use; if `'geometric'` (default), the geometric mean is used; to avoid `log(0)`, 1 is added to all counts before the log-transform (via `log1p`), the arithmetic mean is calculated, and the result is back-transformed with 1 subtracted (via `expm1`); if this parameter is set to `'arithmetic'`, the data is used as is
`verbosity`	Integer controlling how many messages the function prints; 0 is silent, 1 (default) prints messages
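As a rough illustration of the `mean_type = 'geometric'` computation described in the arguments above, here is a minimal base-R sketch. The helper name `geometric_mean1p` and the toy counts are made up for this example and are not part of sctransform; the package's own implementation operates on the sparse matrix and may differ in detail.

```r
# Hypothetical helper (not an sctransform function):
# log1p-transform, take the arithmetic mean, back-transform with expm1.
geometric_mean1p <- function(x) {
  expm1(mean(log1p(x)))
}

counts <- c(0, 0, 1, 3, 7)       # made-up UMI counts for one gene
gm <- geometric_mean1p(counts)   # zeros are handled because log1p(0) = 0
am <- mean(counts)               # what mean_type = 'arithmetic' would use

# log2 fold change between two groups, as log2(mean1 / mean2)
group1 <- c(2, 4, 0, 6)
group2 <- c(1, 0, 0, 2)
log2FC <- log2(geometric_mean1p(group1) / geometric_mean1p(group2))
```

Because of the log transform, this mean is less sensitive to a few cells with very high counts than the arithmetic mean.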

Value

Data frame of results

Details

This model-free test is applied to each gene (row) individually but is optimized to make use of the efficient sparse data representation of the input. A permutation null distribution is used to assess the significance of the observed difference in mean between two groups.

The observed difference in mean is compared against a distribution obtained by random shuffling of the group labels. For each gene, every random permutation yields a difference in mean, and from the population of these background differences we estimate a mean and standard deviation for the null distribution. This mean and standard deviation are used to turn the observed difference in mean into a z-score and then into a p-value. Finally, all p-values (for the tested genes) are adjusted using the Benjamini & Hochberg method (fdr). The log2FC values in the output are log2(mean1 / mean2).

Empirical p-values are also calculated: emp_pval = (b + 1) / (R + 1), where b is the number of times the absolute difference in mean from a random permutation is at least as large as the absolute value of the observed difference in mean, and R is the number of random permutations. This is an upper bound on the real empirical p-value that would be obtained by enumerating all possible group label permutations.
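The procedure described above can be sketched for a single gene in plain base R. This is only an illustration under stated assumptions: it uses a dense vector and the arithmetic mean, it assumes a two-sided normal p-value for the z-score (the text above does not say how the z-score is converted), and the names `mean_diff` and `perm_test_one_gene` are hypothetical, not sctransform functions.

```r
set.seed(42)  # reproducible shuffles

# Difference in mean between the first and second level of a two-level factor
mean_diff <- function(x, labels) {
  lv <- levels(labels)
  mean(x[labels == lv[1]]) - mean(x[labels == lv[2]])
}

perm_test_one_gene <- function(x, labels, R = 99) {
  observed <- mean_diff(x, labels)
  # Null distribution: recompute the difference after shuffling the labels
  null_diffs <- replicate(R, mean_diff(x, sample(labels)))
  # z-score of the observed difference against the null mean and sd
  zscore <- (observed - mean(null_diffs)) / sd(null_diffs)
  # Assumption of this sketch: two-sided p-value from the standard normal
  pval <- 2 * pnorm(-abs(zscore))
  # Empirical p-value: emp_pval = (b + 1) / (R + 1)
  b <- sum(abs(null_diffs) >= abs(observed))
  emp_pval <- (b + 1) / (R + 1)
  c(mean_diff = observed, zscore = zscore, pval = pval, emp_pval = emp_pval)
}

# Made-up data: 3 genes x 40 cells; only gene "g1" differs between groups
labels <- factor(rep(c("a", "b"), each = 20))
y <- rbind(
  g1 = c(rpois(20, 10), rpois(20, 1)),
  g2 = rpois(40, 2),
  g3 = rpois(40, 2)
)

# Test each gene (row), then adjust the p-values with Benjamini & Hochberg
res <- as.data.frame(t(apply(y, 1, perm_test_one_gene, labels = labels)))
res$pval_adj <- p.adjust(res$pval, method = "BH")
```

Note that with `R = 99` the empirical p-value can never be smaller than 1 / 100, whereas the z-score based p-value is not bounded by the number of permutations, which is one reason to report both.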

Examples

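# labels must be a factor with exactly two levels; here cells simply alternate
clustering <- factor(seq_len(ncol(pbmc)) %% 2)
# return_corrected_umi = TRUE adds the corrected counts used as input below
vst_out <- vst(pbmc, return_corrected_umi = TRUE)
de_res <- diff_mean_test(y = vst_out$umi_corrected, labels = clustering)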

sctransform

Variance Stabilizing Transformations for Single Cell UMI Data

v0.3.2

GPL-3 | file LICENSE

Authors

Christoph Hafemeister [aut, cre] (<https://orcid.org/0000-0001-6365-8254>)

