Fast (Grouped, Weighted) Mean for Matrix-Like Objects
fmean is a generic function that computes the (column-wise) mean of x, (optionally) grouped by g and/or weighted by w.
The TRA argument can further be used to transform x using its (grouped, weighted) mean.
fmean(x, ...)

## Default S3 method:
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, ...)

## S3 method for class 'matrix'
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)

## S3 method for class 'data.frame'
fmean(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)

## S3 method for class 'grouped_df'
fmean(x, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = FALSE, keep.group_vars = TRUE, keep.w = TRUE, ...)
x | a numeric vector, matrix, data frame or grouped data frame (class 'grouped_df').
g | a factor,
w | a numeric vector of (non-negative) weights, may contain missing values.
TRA | an integer or quoted operator indicating the transformation to perform: 1 - "replace_fill", 2 - "replace", 3 - "-", 4 - "-+", 5 - "/", 6 - "%", 7 - "+", 8 - "*", 9 - "%%", 10 - "-%%". See
na.rm | logical. Skip missing values in
use.g.names | logical. Make group-names and add to the result as names (default method) or row-names (matrix and data frame methods). No row-names are generated for data.table's.
drop | matrix and data.frame method: Logical.
keep.group_vars | grouped_df method: Logical.
keep.w | grouped_df method: Logical. Retain summed weighting variable after computation (if contained in
... | arguments to be passed to or from other methods.
Missing-value removal, as controlled by the na.rm argument, is done very efficiently by simply skipping missing values in the computation (thus setting na.rm = FALSE on data with no missing values doesn't give extra speed). Large performance gains can nevertheless be achieved in the presence of missing values if na.rm = FALSE, since the corresponding computation is then terminated as soon as an NA is encountered and NA is returned (unlike mean, which just runs through without any checks).
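The two na.rm code paths described above can be pictured with a plain R loop. This is only an illustrative sketch: mean_sketch is a hypothetical helper written for this example, not the compiled code that fmean actually runs.

```r
# Hypothetical helper: illustrates the two na.rm code paths described above.
mean_sketch <- function(x, na.rm = TRUE) {
  total <- 0
  n <- 0L
  for (xi in x) {
    if (is.na(xi)) {
      if (na.rm) next          # na.rm = TRUE: skip the missing value
      return(NA_real_)         # na.rm = FALSE: stop at the first NA
    }
    total <- total + xi
    n <- n + 1L
  }
  total / n
}

x <- c(1, 2, NA, 4)
mean_sketch(x)                  # mean of the three non-missing values
mean_sketch(x, na.rm = FALSE)   # NA, returned as soon as the NA is reached
```

With na.rm = FALSE the loop never looks at the values after the first NA, which is where the speed gain on data with missing values would come from.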
The weighted mean is computed as sum(x * w) / sum(w). If na.rm = TRUE, missing values will be removed from both x and w, i.e. utilizing only x[complete.cases(x,w)] and w[complete.cases(x,w)].
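The formula above can be reproduced in base R. The sketch below uses made-up vectors x and w (assumptions for illustration only) and drops every observation where either the value or the weight is missing before applying sum(x * w) / sum(w).

```r
x <- c(10, 20, NA, 40)
w <- c(1, 2, 3, NA)

ok <- complete.cases(x, w)      # TRUE only where both x and w are observed
wmean <- sum(x[ok] * w[ok]) / sum(w[ok])
wmean

# Same value via base R's weighted.mean() on the complete cases
stopifnot(isTRUE(all.equal(wmean, weighted.mean(x[ok], w[ok]))))
```

Only the first two observations survive here, because the third has a missing value and the fourth a missing weight.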
All of this generalizes seamlessly to grouped computations, which are performed in a single pass (without splitting the data) and are therefore extremely fast.
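To make the idea of a single pass without splitting concrete, here is a rough base R sketch. grouped_mean_sketch is a hypothetical helper for illustration and makes no claim about how fmean is implemented internally: it keeps one running sum and one count per group and visits each observation exactly once.

```r
# Hypothetical helper: grouped mean in one pass over x, without split().
grouped_mean_sketch <- function(x, g) {
  g <- as.factor(g)
  idx <- as.integer(g)                 # group index of each observation
  sums <- numeric(nlevels(g))          # running sum per group
  counts <- integer(nlevels(g))        # running count per group
  for (i in seq_along(x)) {
    if (is.na(x[i])) next              # mimic na.rm = TRUE
    sums[idx[i]] <- sums[idx[i]] + x[i]
    counts[idx[i]] <- counts[idx[i]] + 1L
  }
  setNames(sums / counts, levels(g))
}

res <- grouped_mean_sketch(mtcars$mpg, mtcars$cyl)
res

# Cross-check against the split-apply approach in base R
stopifnot(isTRUE(all.equal(unname(res),
                           as.vector(tapply(mtcars$mpg, mtcars$cyl, mean)))))
```

The split-apply route (tapply) first partitions the data and then calls mean once per group; the accumulator approach avoids that partitioning step.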
When applied to data frames with groups or drop = FALSE, fmean preserves all column attributes (such as variable labels) but does not distinguish between classed and unclassed objects (thus applying fmean to a factor column will give a 'malformed factor' error). The attributes of the data frame itself are also preserved.
The (w weighted) mean of x, grouped by g, or (if TRA is used) x transformed by its mean, grouped by g.
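As an illustration of what "x transformed by its mean, grouped by g" means for the "-" transformation, the following base R sketch subtracts each observation's group mean. It uses ave() from base R rather than fmean, so it shows the intended result only, not the function's own code path.

```r
mpg <- mtcars$mpg
cyl <- mtcars$cyl

group_means <- ave(mpg, cyl)        # each observation's group mean (base R)
demeaned <- mpg - group_means       # the result TRA = "-" is described as giving
head(demeaned)

# Within every group the demeaned values average to (numerically) zero
stopifnot(all(abs(tapply(demeaned, cyl, mean)) < 1e-12))
```

The output has the same length as the input, in contrast to the grouped mean itself, which has one value per group.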
library(collapse)   # provides fmean, GRP, qM, fgroup_by and fselect

## default vector method
mpg <- mtcars$mpg
fmean(mpg)                             # Simple mean
fmean(mpg, w = mtcars$hp)              # Weighted mean: Weighted by hp
fmean(mpg, TRA = "-")                  # Simple transformation: demeaning (See also ?W)
fmean(mpg, mtcars$cyl)                 # Grouped mean
fmean(mpg, mtcars[8:9])                # another grouped mean.
g <- GRP(mtcars[c(2,8:9)])
fmean(mpg, g)                          # Pre-computing groups speeds up the computation
fmean(mpg, g, mtcars$hp)               # Grouped weighted mean
fmean(mpg, g, TRA = "-")               # Demeaning by group
fmean(mpg, g, mtcars$hp, "-")          # Group-demeaning using weighted group means

## data.frame method
fmean(mtcars)
fmean(mtcars, g)
fmean(fgroup_by(mtcars, cyl, vs, am))  # Another way of doing it..
head(fmean(mtcars, g, TRA = "-"))      # etc..

## matrix method
m <- qM(mtcars)
fmean(m)
fmean(m, g)
head(fmean(m, g, TRA = "-"))           # etc..

## method for grouped data frames - created with dplyr::group_by or fgroup_by
library(dplyr)
mtcars %>% group_by(cyl,vs,am) %>% fmean            # Ordinary
mtcars %>% group_by(cyl,vs,am) %>% fmean(hp)        # Weighted
mtcars %>% group_by(cyl,vs,am) %>% fmean(hp, "-")   # Weighted Transform
mtcars %>% group_by(cyl,vs,am) %>%
  select(mpg,hp) %>% fmean(hp, "-")                 # Only mpg
mtcars %>% fgroup_by(cyl,vs,am) %>%                 # Equivalent and faster !
  fselect(mpg,hp) %>% fmean(hp, "-")