Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

form_tails

Transform tails of distribution


Description

Modify tail(s) of distribution defined by certain cutoff level using method of choice. This function is useful for doing robust analysis in presence of possible outliers.

Usage

form_tails(f, level, method = "trim", direction = "both")

Arguments

f

A pdqr-function.

level

Cutoff level. For direction "both" should be between 0 and 0.5; for "left" and "right" - between 0 and 1.

method

Modification method. One of "trim" or "winsor".

direction

Information about which tail(s) to modify. One of "both", "left", "right".

Details

Edges for left and right tails are computed as level and 1 - level quantiles respectively. The left tail is interval to the left of left edge, and right tail - to the right of right edge.

Method "trim" removes tail(s) while normalizing "center part". Method "winsor" "squashes" tails inside center of distribution in dirac-like fashion, i.e. probability of tail(s) is moved inside and becomes concentrated in 1e-8 neighborhood of nearest edge.

Direction "both" affect both tails. Directions "left" and "right" affect only left and right tail respectively.

Value

A pdqr-function with transformed tail(s).

See Also

form_resupport() for changing support to some known interval.

summ_center() and summ_spread() for computing summaries of distributions.

Examples

# Type "discrete"
my_dis <- new_d(data.frame(x = 1:4, prob = (1:4) / 10), type = "discrete")
meta_x_tbl(form_tails(my_dis, level = 0.1))
meta_x_tbl(
  form_tails(my_dis, level = 0.35, method = "winsor", direction = "left")
)

# Type "continuous"
d_norm <- as_d(dnorm)
plot(d_norm)
lines(form_tails(d_norm, level = 0.1), col = "blue")
lines(
  form_tails(d_norm, level = 0.1, method = "winsor", direction = "right"),
  col = "green"
)

# Use `form_resupport()` and `as_q()` to remove different levels from both
# directions. Here 0.1 level tail from left is removed, and 0.05 level from
# right
new_supp <- as_q(d_norm)(c(0.1, 1 - 0.05))
form_resupport(d_norm, support = new_supp)

# Examples of robust mean
set.seed(101)
x <- rcauchy(1000)
d_x <- new_d(x, "continuous")
summ_mean(d_x)
## Trimmed mean
summ_mean(form_tails(d_x, level = 0.1, method = "trim"))
## Winsorized mean
summ_mean(form_tails(d_x, level = 0.1, method = "winsor"))

pdqr

Work with Custom Distribution Functions

v0.3.0
MIT + file LICENSE
Authors
Evgeni Chasnovski [aut, cre] (<https://orcid.org/0000-0002-1617-4019>)
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.