rliger: selectGenes – R documentation

selectGenes

Select a subset of informative genes

Description

This function identifies highly variable genes in each dataset and combines the resulting gene sets (by union or intersection) for use in downstream analysis. Assuming that gene expression approximately follows a Poisson distribution, it selects genes whose expression variance exceeds a given threshold relative to mean gene expression. When do.plot = TRUE, it also displays a log plot of gene variance vs. gene expression for each dataset (with a line indicating expected expression across genes and cells), in which selected genes are plotted in green.
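The thresholding idea described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the rliger implementation: the data are simulated, the helper `select_variable` is hypothetical, and the `alpha.thresh` bound is left out entirely. It assumes each cell is normalized by its total count, in which case the Poisson-expected variance of a gene is roughly its mean times the average of 1 / (total counts per cell).

```r
# Simplified illustration only; this is not the rliger source code.
set.seed(1)
n_genes <- 200
n_cells <- 500

# Simulated counts (genes x cells): genes 1-20 are overdispersed
# (negative binomial), the remaining genes are Poisson.
gene_means <- c(rep(3, 20), runif(n_genes - 20, min = 0.5, max = 5))
counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_means, times = n_cells)),
  nrow = n_genes
)
counts[1:20, ] <- rnbinom(20 * n_cells, mu = 3, size = 0.2)
rownames(counts) <- paste0("gene", seq_len(n_genes))

# Normalize each cell (column) by its total count.
totals <- colSums(counts)
norm <- sweep(counts, 2, totals, "/")

gene_mean <- rowMeans(norm)
gene_var <- apply(norm, 1, var)

# For Poisson counts divided by the cell total, the expected variance is
# roughly mean * mean(1 / total).
expected_var <- gene_mean * mean(1 / totals)

# Hypothetical helper: keep genes whose log10 variance exceeds the
# Poisson expectation by more than var.thresh.
select_variable <- function(var.thresh) {
  names(which(log10(gene_var) > log10(expected_var) + var.thresh))
}

loose <- select_variable(0.1)
strict <- select_variable(0.3)
length(loose)
length(strict)
```

Because the comparison is made on a log10 scale, raising the threshold can only shrink the selected set, which matches the "higher threshold -> fewer selected genes" behavior documented for `var.thresh`.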

Usage

selectGenes(
  object,
  var.thresh = 0.1,
  alpha.thresh = 0.99,
  num.genes = NULL,
  tol = 1e-04,
  datasets.use = 1:length(object@raw.data),
  combine = "union",
  capitalize = FALSE,
  do.plot = FALSE,
  cex.use = 0.3,
  chunk = 1000
)

Arguments

`object`	`liger` object. normalize must already have been called on it.
`var.thresh`	Variance threshold. Main threshold used to identify variable genes. Genes with expression variance greater than the threshold (relative to mean) are selected (higher threshold -> fewer selected genes). Accepts a single value or a vector with a separate var.thresh for each dataset. (default 0.1)
`alpha.thresh`	Alpha threshold. Controls upper bound for expected mean gene expression (lower threshold -> higher upper bound). (default 0.99)
`num.genes`	Number of genes to find for each dataset. Optimizes the value of var.thresh for each dataset to get this number of genes. Accepts a single value or a vector with the same length as the number of datasets. (optional, default NULL)
`tol`	Tolerance to use for optimization if num.genes values passed in (default 0.0001).
`datasets.use`	List of datasets to include for discovery of highly variable genes. (default 1:length(object@raw.data))
`combine`	How to combine variable genes across experiments. Either "union" or "intersection". (default "union")
`capitalize`	Capitalize gene names to match homologous genes (i.e., across species). (default FALSE)
`do.plot`	Display log plot of gene variance vs. gene expression for each dataset. Selected genes are plotted in green. (default FALSE)
`cex.use`	Point size for plot. (default 0.3)
`chunk`	Size of chunks in the HDF5 file. (default 1000)
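
To illustrate how the `combine` and `capitalize` arguments interact, here is a minimal base R sketch. The helper `combine_genes` and the gene lists are hypothetical and stand in for per-dataset variable gene sets; the sketch does not reproduce how rliger performs this step internally.

```r
# Hypothetical per-dataset variable gene sets (illustrative names only).
genes_by_dataset <- list(
  human = c("GAPDH", "ACTB", "CD3E"),
  mouse = c("Gapdh", "Cd3e", "Ptprc")
)

# Hypothetical helper: combine gene sets by union or intersection,
# optionally upper-casing names first so homologs can match.
combine_genes <- function(gene_sets, combine = c("union", "intersection"),
                          capitalize = FALSE) {
  combine <- match.arg(combine)
  if (capitalize) gene_sets <- lapply(gene_sets, toupper)
  op <- if (combine == "union") union else intersect
  Reduce(op, gene_sets)
}

all_genes <- combine_genes(genes_by_dataset, "union")
shared_raw <- combine_genes(genes_by_dataset, "intersection")
shared_cap <- combine_genes(genes_by_dataset, "intersection", capitalize = TRUE)
```

Without capitalization the human and mouse symbols differ in case, so the intersection is empty; after upper-casing, the two shared homologs are recovered.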

Value

liger object with var.genes slot set.

Examples

## Not run: 
library(rliger)
# Given gene-by-cell expression matrices Y and Z
ligerex <- createLiger(list(y_set = Y, z_set = Z))
ligerex <- normalize(ligerex)
# use default selectGenes settings (var.thresh = 0.1)
ligerex <- selectGenes(ligerex)
# select a smaller subset of genes
ligerex <- selectGenes(ligerex, var.thresh = 0.3)

## End(Not run)
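
The `num.genes` and `tol` arguments describe a search for the `var.thresh` value that yields a requested number of genes. One way such a search could work is bisection, sketched below in base R. This is an assumption made for illustration, not a description of the optimizer rliger uses; `excess_log_var`, `n_selected`, and `find_thresh` are hypothetical names, and the scores are simulated.

```r
# Illustrative only: simulated stand-in for log10(variance) minus
# log10(expected variance), one value per gene.
set.seed(42)
excess_log_var <- rnorm(500, mean = 0, sd = 0.3)

# Number of genes that would pass a given threshold.
n_selected <- function(thresh) sum(excess_log_var > thresh)

# Bisection search: n_selected() never increases as thresh grows, so the
# interval [lower, upper] can be halved until it is narrower than tol.
find_thresh <- function(target, lower = 0, upper = 2, tol = 1e-4) {
  while (upper - lower > tol) {
    mid <- (lower + upper) / 2
    if (n_selected(mid) > target) {
      lower <- mid   # too many genes: raise the threshold
    } else {
      upper <- mid   # at or below target: try a lower threshold
    }
  }
  upper
}

thresh <- find_thresh(target = 50)
n_selected(thresh)
```

The returned threshold never selects more than the target number of genes, and a threshold slightly lower than it would select more, so a smaller `tol` gives a tighter match at the cost of more iterations.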

rliger

Linked Inference of Genomic Experimental Relationships

v1.0.0

GPL-3

Authors

Joshua Welch [aut, ctb], Chao Gao [aut, ctb, cre], Jialin Liu [aut, ctb], Joshua Sodicoff [aut, ctb], Velina Kozareva [aut, ctb], Evan Macosko [aut, ctb], Paul Hoffman [ctb], Ilya Korsunsky [ctb], Robert Lee [ctb]

Initial release

2021-04-18