M3C: featurefilter – R documentation


featurefilter

featurefilter: A function for filtering features

Description

This function filters features by their variability. The most appropriate metric depends on the data. Simple variance (var) is suitable when variance does not tend to increase with the mean. The median absolute deviation (MAD) is more robust to outliers than variance and is generally preferable. The coefficient of variation (A) and its second-order alternative (A2) (Kvålseth, 2017) are also included; both standardise the standard deviation with respect to the mean. It is best to examine the mean-variance relationship of the data manually, for example by plotting the statistics returned by this function with the qplot function from ggplot2.
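To make the four metrics concrete, the following base R sketch computes them on a small simulated matrix. It is an illustration only, not the package's own code: the `toy` data and all variable names are hypothetical, and the expression used for A2, A / sqrt(1 + A^2), is an assumed reading of the second-order coefficient described by Kvålseth (2017).

```r
# Toy matrix: rows are features, columns are samples (simulated data).
set.seed(1)
toy <- matrix(rexp(50 * 8, rate = 0.1), nrow = 50,
              dimnames = list(paste0("feature", 1:50), paste0("sample", 1:8)))

feature_mean <- rowMeans(toy)
feature_sd   <- apply(toy, 1, sd)

stats <- data.frame(
  mean = feature_mean,
  var  = feature_sd^2,               # simple variance
  MAD  = apply(toy, 1, mad),         # median absolute deviation
  A    = feature_sd / feature_mean   # coefficient of variation
)

# Assumed form of the second-order coefficient: A / sqrt(1 + A^2).
stats$A2 <- stats$A / sqrt(1 + stats$A^2)

# Rank correlation between mean and variance. A value close to 1 suggests
# that variance grows with the mean, in which case raw variance would tend
# to favour high-mean features.
mean_var_cor <- cor(stats$mean, stats$var, method = "spearman")

# Features ordered by MAD, most variable first.
head(stats[order(stats$MAD, decreasing = TRUE), ])
```

If the rank correlation is high, a mean-standardised metric such as A or A2 may be a more sensible choice than var.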

Usage

featurefilter(mydata, percentile = 10, method = "MAD", topN = 20)

Arguments

`mydata`	Data frame: samples as columns and features as rows
`percentile`	Numerical value: the percentage of most variable features to keep (for example, 10 keeps the top 10 percent)
`method`	Character vector: variance (var), coefficient of variation (A), second order A (A2), median absolute deviation (MAD)
`topN`	Numerical value: the number of most variable features to display
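The effect of the percentile argument can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual implementation: `keep_top_percent` is a hypothetical helper, and rounding the number of retained features up with `ceiling` is an assumption.

```r
# Hypothetical helper (not part of M3C): keep the top `percentile` percent
# of rows of `x`, ranked by a per-feature variability score.
keep_top_percent <- function(x, score, percentile = 10) {
  stopifnot(length(score) == nrow(x), percentile > 0, percentile <= 100)
  n_keep <- max(1, ceiling(nrow(x) * percentile / 100))
  ranked <- order(score, decreasing = TRUE)
  x[ranked[seq_len(n_keep)], , drop = FALSE]
}

# Simulated data: 200 features (rows) by 6 samples (columns).
set.seed(42)
toy <- as.data.frame(matrix(rnorm(200 * 6), nrow = 200))

# Score each feature by its median absolute deviation, then filter.
mad_score <- apply(toy, 1, mad)
toy_filtered <- keep_top_percent(toy, mad_score, percentile = 10)
dim(toy_filtered)
```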

Value

A list containing: 1) the filtered data 2) statistics for each feature, ordered according to the chosen filtering metric

References

Kvålseth, Tarald O. "Coefficient of variation: the second-order alternative." Journal of Applied Statistics 44.3 (2017): 402-415.

Examples

filtered <- featurefilter(mydata, percentile = 10)
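A slightly fuller usage sketch is given below. It runs only when the M3C package is installed, and it assumes that `mydata` is the example data set supplied with the package. The two list elements are accessed by position, following the order given in the Value section, because their names are not documented here.

```r
# Guarded so the script still runs when M3C is not installed.
has_m3c <- requireNamespace("M3C", quietly = TRUE)

if (has_m3c) {
  library(M3C)

  # Keep the top 10 percent of features by MAD and display the top 5.
  result <- featurefilter(mydata, percentile = 10, method = "MAD", topN = 5)

  filtered_data <- result[[1]]  # first element: the filtered data
  feature_stats <- result[[2]]  # second element: per-feature statistics

  print(dim(filtered_data))
  print(head(feature_stats))
}
```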

M3C

Monte Carlo Reference-based Consensus Clustering

v1.12.0

AGPL-3

Authors

Christopher John, David Watson

Initial release