
featurefilter

featurefilter: A function for filtering features


Description

This function filters features by their variability. The most appropriate metric depends on the data. Simple variance (var) is suitable when variance does not tend to increase with the mean. The median absolute deviation (MAD) is more robust than variance and is generally preferable. The coefficient of variation (A) and its second-order alternative (A2) (Kvålseth, 2017) are also included; both standardise the standard deviation with respect to the mean. It is best to examine the mean-variance relationship of the data manually, for example by plotting the statistics returned by this function with the qplot function from ggplot2.
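The four metrics can be illustrated with a minimal base R sketch. This is not the package's own code: the toy data, the `stats` data frame and its column names are invented for illustration, and the A2 formula, sqrt(A^2 / (A^2 + 1)), is an assumed reading of Kvålseth (2017) that should be checked against the paper.

```r
# Toy data: 5 features (rows) x 6 samples (columns); all values are made up.
set.seed(1)
mydata <- as.data.frame(matrix(rexp(30, rate = 0.2) + 1, nrow = 5,
                               dimnames = list(paste0("feature", 1:5),
                                               paste0("sample", 1:6))))

m <- as.matrix(mydata)
feature_mean <- rowMeans(m)
feature_sd   <- apply(m, 1, sd)

# One row per feature, one column per metric (column names are hypothetical).
stats <- data.frame(
  mean = feature_mean,
  var  = feature_sd^2,               # simple variance
  MAD  = apply(m, 1, mad),           # stats::mad, scaled by 1.4826 by default
  A    = feature_sd / feature_mean   # coefficient of variation
)

# Assumed form of the second-order coefficient of variation.
stats$A2 <- sqrt(stats$A^2 / (stats$A^2 + 1))

print(stats)
```

Plotting `stats$mean` against `stats$var` on real data is one way to check whether variance grows with the mean before choosing a metric.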

Usage

featurefilter(mydata, percentile = 10, method = "MAD", topN = 20)

Arguments

mydata

Data frame: samples as columns and features as rows

percentile

Numerical value: the percentage of features to keep, taken from the most variable (the top X percent)

method

Character vector: the filtering metric, one of variance (var), coefficient of variation (A), second-order A (A2) or median absolute deviation (MAD)

topN

Numerical value: the number of most variable features to display

Value

A list containing: 1) the filtered data, 2) statistics for each feature, ordered according to the chosen filtering metric

References

Kvålseth, Tarald O. "Coefficient of variation: the second-order alternative." Journal of Applied Statistics 44.3 (2017): 402-415.

Examples

filtered <- featurefilter(mydata, percentile = 10)
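To show what keeping the top 10 percent of features means, here is a rough base R stand-in that ranks features by MAD and keeps the highest-scoring ones. It does not call M3C and may differ from how featurefilter selects features internally (for example in how ties or the cut-off are handled). The simulated data and the names `score`, `n_keep` and `filtered_sketch` are hypothetical.

```r
# Simulated input: 100 features (rows) x 8 samples (columns).
set.seed(42)
mydata <- as.data.frame(matrix(rnorm(100 * 8), nrow = 100,
                               dimnames = list(paste0("feature", 1:100),
                                               paste0("sample", 1:8))))

percentile <- 10

# Score every feature by its median absolute deviation across samples.
score <- apply(as.matrix(mydata), 1, mad)

# Number of features in the top `percentile` percent (rounded up).
n_keep <- ceiling(nrow(mydata) * percentile / 100)

# Row indices of the n_keep highest-scoring features, most variable first.
keep <- order(score, decreasing = TRUE)[seq_len(n_keep)]

filtered_sketch <- mydata[keep, , drop = FALSE]
dim(filtered_sketch)  # 10 features, 8 samples
```

Swapping `mad` for `var` gives the variance-based ranking instead.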

M3C

Monte Carlo Reference-based Consensus Clustering

v1.12.0
AGPL-3
Authors
Christopher John, David Watson
Initial release
