Summarize list of pdqr-functions with order
Functions for ordering the set of pdqr-functions supplied in a list. This might be useful for doing comparative statistical inference for several groups of data.
summ_order(f_list, method = "compare", decreasing = FALSE)
summ_sort(f_list, method = "compare", decreasing = FALSE)
summ_rank(f_list, method = "compare")
f_list
List of pdqr-functions.
method
Method to be used for ordering. Should be one of "compare", "mean", "median", "mode", "midrange".
decreasing
If TRUE, ordering is done in decreasing order.
Ties for all methods are handled so as to preserve the original order.
Method "compare" uses the following ordering relation: pdqr-function f is greater than g if and only if P(f >= g) > 0.5, or in code summ_prob_true(f >= g) > 0.5 (see pdqr methods for "Ops" group generic family for more details on comparing pdqr-functions). This method orders input based on this relation and the order() function. Notes:
This relation doesn't define a strict ordering because it is not transitive: there can be pdqr-functions f, g, and h for which f is greater than g, g is greater than h, and h is greater than f (whereas transitivity would require f to be greater than h). If not addressed, this might make the output depend on the order of the input. It is solved by first preordering f_list based on method "mean" and then calling order().
Because comparing two pdqr-functions can be time consuming, this method becomes rather slow as the number of f_list elements grows.
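The non-transitivity described in the first note can be reproduced without pdqr. The following is a minimal base R sketch, assuming three independent discrete distributions given by equally likely values (the classic non-transitive dice). The helpers prob_geq() and is_greater() and the dice values are hypothetical names and data introduced only for illustration; they are not part of pdqr and do not show how summ_order() is implemented.

```r
# Hypothetical helper: P(X >= Y) for two independent distributions, each
# uniform over its vector of values. outer() builds all pairwise comparisons.
prob_geq <- function(x, y) mean(outer(x, y, ">="))

# Hypothetical helper mirroring the relation from the text:
# x is "greater" than y if and only if P(x >= y) > 0.5
is_greater <- function(x, y) prob_geq(x, y) > 0.5

# Illustrative data (not from pdqr): three dice with equal means
a <- c(2, 2, 4, 4, 9, 9)
b <- c(1, 1, 6, 6, 8, 8)
c_die <- c(3, 3, 5, 5, 7, 7)

# Each probability equals 5/9, so the relation forms a cycle
cycle <- c(
  a_over_b = is_greater(a, b),
  b_over_c = is_greater(b, c_die),
  c_over_a = is_greater(c_die, a)
)
print(cycle)  # all three are TRUE
```

Since every element "beats" one neighbour and loses to the other, a sort driven only by pairwise comparisons could return different results for different input orders, which is the problem the preordering step is meant to remove.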
Methods "mean", "median", "mode", and "midrange" are based on summ_center(): ordering of f_list is defined as ordering of corresponding measures of distribution's center.
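As a rough illustration of the center-based methods, the sketch below computes a center for each pdqr-function with summ_center() and orders the resulting numbers with order(). It assumes the pdqr package is installed and reuses f_list from the examples on this page. The variable names centers and ord are introduced here, and treating this as equivalent to summ_order(f_list, method = "mean") is an assumption based on the description above, not a statement about the package internals.

```r
library(pdqr)

# Same input as in the examples of this page
d_fun <- as_d(dunif)
f_list <- list(a = d_fun, b = d_fun + 1, c = d_fun - 1)

# One center value per pdqr-function (here the mean)
centers <- vapply(f_list, summ_center, numeric(1), method = "mean")

# Ordering the centers gives a permutation of f_list positions
ord <- order(centers)
print(ord)  # 3 1 2: "c" has the smallest mean, then "a", then "b"
```

Replacing method = "mean" with "median", "mode", or "midrange" in the vapply() call changes only which center is compared.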
summ_order() works essentially like order(). It returns an integer vector representing a permutation which rearranges f_list in desired order.
summ_sort() returns a sorted (in desired order) variant of f_list.
summ_rank() returns a numeric vector representing ranks of f_list elements: 1 for the "smallest", length(f_list) for the "biggest".
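The relationship between the three return values can be illustrated with the base R functions on plain numbers. This is only an analogy: the vector centers below holds made-up values standing in for four hypothetical distributions, with "a" and "d" tied on purpose to show ties keeping their original order.

```r
# Illustrative center values (made up); "a" and "d" are tied
centers <- c(a = 0.5, b = 1.5, c = -0.5, d = 0.5)

# Analogue of summ_order(): permutation that sorts the input
ord <- order(centers)
print(ord)  # 3 1 4 2

# Analogue of summ_sort(): input rearranged by that permutation
sorted <- centers[ord]
print(names(sorted))  # "c" "a" "d" "b"

# Analogue of summ_rank(): 1 for the smallest, length(centers) for the biggest;
# ties.method = "first" keeps tied elements in their original order
rnk <- rank(centers, ties.method = "first")
print(rnk)  # a = 2, b = 4, c = 1, d = 3
```

With ties broken by original position, the rank vector is the inverse of the ordering permutation, so order(ord) gives the same numbers as rnk.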
Other summary functions: summ_center(), summ_classmetric(), summ_distance(), summ_entropy(), summ_hdr(), summ_interval(), summ_moment(), summ_prob_true(), summ_pval(), summ_quantile(), summ_roc(), summ_separation(), summ_spread()
d_fun <- as_d(dunif)
f_list <- list(a = d_fun, b = d_fun + 1, c = d_fun - 1)
summ_order(f_list)
summ_sort(f_list)
summ_rank(f_list)

# All methods might give different results on some elaborated pdqr-functions
# Methods "compare" and "mean" are not equivalent
non_mean_list <- list(
  new_d(data.frame(x = c(0.56, 0.815), y = c(1, 1)), "continuous"),
  new_d(data.frame(x = 0:1, y = c(0, 1)), "continuous")
)
summ_order(non_mean_list, method = "compare")
summ_order(non_mean_list, method = "mean")

# Methods powered by `summ_center()` are not equivalent
m <- c(0, 0.2, 0.1)
s <- c(1.1, 1.2, 1.3)
dlnorm_list <- lapply(seq_along(m), function(i) {
  as_d(dlnorm, meanlog = m[i], sdlog = s[i])
})
summ_order(dlnorm_list, method = "mean")
summ_order(dlnorm_list, method = "median")
summ_order(dlnorm_list, method = "mode")

# Method "compare" handles inherited non-transitivity. Here third element is
# "greater" than second (`P(f >= g) > 0.5`), second - than first, and first
# is "greater" than third.
non_trans_list <- list(
  new_d(data.frame(x = c(0.39, 0.44, 0.46), y = c(17, 14, 0)), "continuous"),
  new_d(data.frame(x = c(0.05, 0.3, 0.70), y = c(4, 0, 4)), "continuous"),
  new_d(data.frame(x = c(0.03, 0.40, 0.80), y = c(1, 1, 1)), "continuous")
)
summ_sort(non_trans_list)
## Output doesn't depend on initial order
summ_sort(non_trans_list[c(2, 3, 1)])