rliger: quantile_norm – R documentation

quantile_norm

Quantile align (normalize) factor loadings

Description

This process builds a shared factor neighborhood graph to jointly cluster cells, then quantile normalizes corresponding clusters.

Usage

quantile_norm(object, ...)

## S3 method for class 'list'
quantile_norm(
  object,
  quantiles = 50,
  ref_dataset = NULL,
  min_cells = 20,
  knn_k = 20,
  dims.use = NULL,
  do.center = FALSE,
  max_sample = 1000,
  eps = 0.9,
  refine.knn = TRUE,
  rand.seed = 1,
  ...
)

## S3 method for class 'liger'
quantile_norm(
  object,
  quantiles = 50,
  ref_dataset = NULL,
  min_cells = 20,
  knn_k = 20,
  dims.use = NULL,
  do.center = FALSE,
  max_sample = 1000,
  eps = 0.9,
  refine.knn = TRUE,
  rand.seed = 1,
  ...
)

Arguments

`object`	`liger` object. Run optimizeALS before calling this function.
`...`	Arguments passed to other methods
`quantiles`	Number of quantiles to use for quantile normalization (default 50).
`ref_dataset`	Name of dataset to use as a "reference" for normalization. By default, the dataset with the largest number of cells is used.
`min_cells`	Minimum number of cells to consider a cluster shared across datasets (default 20).
`knn_k`	Number of nearest neighbors for within-dataset knn graph (default 20).
`dims.use`	Indices of factors to use for shared nearest factor determination (default 1:ncol(H[[1]])).
`do.center`	Whether to center the data when scaling factors; useful for less sparse modalities such as methylation data (default FALSE).
`max_sample`	Maximum number of cells used for quantile normalization of each cluster and factor. (default 1000)
`eps`	The error bound of the nearest neighbor search (default 0.9). Lower values give more accurate nearest neighbor graphs but take much longer to compute.
`refine.knn`	Whether to increase robustness of cluster assignments using the KNN graph (default TRUE).
`rand.seed`	Random seed to allow reproducible results (default 1).
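The defaults described for `ref_dataset` and `dims.use` can be illustrated in base R. This is a minimal sketch, not rliger's code: the list `H`, its dataset names, and its sizes are hypothetical stand-ins for the per-dataset factor loading matrices (cells by factors).

```r
# Hypothetical per-dataset factor loading matrices (cells x factors);
# the names and sizes are made up for illustration.
set.seed(1)
H <- list(
  stim = matrix(runif(30 * 4), nrow = 30, ncol = 4),
  ctrl = matrix(runif(50 * 4), nrow = 50, ncol = 4)
)

# ref_dataset = NULL: fall back to the dataset with the most cells.
n_cells <- vapply(H, nrow, integer(1))
ref_dataset <- names(H)[which.max(n_cells)]

# dims.use = NULL: fall back to every factor, i.e. 1:ncol(H[[1]]).
dims_use <- seq_len(ncol(H[[1]]))
```

Here `ref_dataset` resolves to the larger of the two hypothetical datasets and `dims_use` covers all four factors.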

Details

The first step, building the shared factor neighborhood graph, is performed in SNF(), and produces a graph representation where edge weights between cells (across all datasets) correspond to their similarity in the shared factor neighborhood space. An important parameter here is knn_k, the number of neighbors used to build the shared factor space.
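To give a feel for what knn_k controls, the following base R sketch finds each cell's knn_k nearest neighbors in factor space. It is only an illustration under stated assumptions: the matrix `H` is hypothetical, the search is exact and brute force (whereas the `eps` argument suggests the package uses an approximate search), and the construction of the shared factor neighborhood graph itself is not shown.

```r
set.seed(1)
# Hypothetical factor loadings: 8 cells x 3 factors.
H <- matrix(runif(8 * 3), nrow = 8, ncol = 3,
            dimnames = list(paste0("cell", 1:8), NULL))
knn_k <- 3

# Pairwise Euclidean distances between cells in factor space.
d <- as.matrix(dist(H))
diag(d) <- Inf  # a cell is not its own neighbor

# For each cell, the indices of its knn_k closest cells (one row per cell).
knn_idx <- t(apply(d, 1, function(row) order(row)[seq_len(knn_k)]))
```

A smaller knn_k restricts each cell to a tighter neighborhood, so the resulting graph reflects more local structure; a larger value smooths over it.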

Next we perform quantile alignment for each dataset, factor, and cluster (by stretching/compressing datasets' quantiles to better match those of the reference dataset). These aligned factor loadings are combined into a single matrix and returned as H.norm.
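The stretching and compressing of quantiles can be sketched for a single dataset, factor, and cluster using base R's `quantile()` and `approxfun()`. This is a simplified illustration rather than the package's implementation: the vectors `ref_vals` and `other_vals` are simulated, and subsampling to `max_sample` cells and the `min_cells` check are omitted.

```r
set.seed(1)
quantiles <- 50

# Hypothetical loadings on one factor for the cells of one cluster,
# in the reference dataset and in a second dataset on a different scale.
ref_vals   <- rexp(200, rate = 1)
other_vals <- rexp(150, rate = 4)

# Quantiles on the same probability grid in both datasets.
probs   <- seq(0, 1, length.out = quantiles + 1)
q_ref   <- quantile(ref_vals, probs)
q_other <- quantile(other_vals, probs)

# Piecewise-linear map from the other dataset's quantiles onto the
# reference quantiles; rule = 2 clamps values outside the observed range.
warp <- approxfun(q_other, q_ref, rule = 2)
aligned <- warp(other_vals)
```

After the mapping, `aligned` spans the same range as `ref_vals` and follows its distribution at the chosen quantile grid, while the ordering of cells within the dataset is kept.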

Value

liger object with the 'H.norm' and 'clusters' slots set.

Examples

## Not run: 
# ligerex (liger object), factorization complete
# do basic quantile alignment
ligerex <- quantile_norm(ligerex)
# change knn_k for more fine-grained local clustering
ligerex <- quantile_norm(ligerex, knn_k = 15)

## End(Not run)

rliger

Linked Inference of Genomic Experimental Relationships

v1.0.0

GPL-3

Authors

Joshua Welch [aut, ctb], Chao Gao [aut, ctb, cre], Jialin Liu [aut, ctb], Joshua Sodicoff [aut, ctb], Velina Kozareva [aut, ctb], Evan Macosko [aut, ctb], Paul Hoffman [ctb], Ilya Korsunsky [ctb], Robert Lee [ctb]

Initial release

2021-04-18