BACON: Blocked Adaptive Computationally-Efficient Outlier Nominators
This function applies an outlier identification algorithm to the data in the x array [n x p], following the approach described by Hadi et al. for their BACON outlier procedure.
mvBACON(x, collect = 4, m = min(collect * p, n * 0.5), alpha = 0.95,
init.sel = c("Mahalanobis", "dUniMedian", "random", "manual"),
man.sel, maxsteps = 100, allowSingular = FALSE, verbose = TRUE)
x |
numeric matrix (of dimension [n x p]), which must not contain missing values. |
collect |
a multiplication factor c, when |
m |
integer in |
alpha |
significance level for the chisq cutoff, used to define the next iteration's basic subset. |
init.sel |
character string, specifying the initial selection mode; implemented modes are:
|
man.sel |
only when |
maxsteps |
maximal number of iteration steps. |
allowSingular |
logical indicating whether a solution should also be sought when no matrix of rank p is found. |
verbose |
logical indicating whether messages that trace the progress of the algorithm are printed. |
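The default for m in the usage line depends on both collect and the data dimensions. The following minimal sketch only evaluates that default expression by hand; the matrix x here is a hypothetical placeholder with the same dimensions as the starsCYG example (n = 47, p = 2), and m_default is an illustrative name, not part of the function's interface.

```r
# Hypothetical placeholder data with n = 47 rows and p = 2 columns
x <- matrix(0, nrow = 47, ncol = 2)
n <- nrow(x)
p <- ncol(x)

collect <- 4                            # default multiplication factor from the usage line
m_default <- min(collect * p, n * 0.5)  # same expression as the default for 'm'
m_default                               # min(8, 23.5) = 8
```

For small p the term collect * p determines the default; the n * 0.5 term only takes over when collect * p exceeds half the number of observations.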
a list with components
subset |
logical vector of length |
dis |
numeric vector of length |
cov |
p x p matrix, the corresponding robust estimate of covariance. |
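To give a feel for how a distance vector and a covariance estimate interact with a chi-squared cutoff, here is a minimal sketch in base R. It is not the mvBACON implementation: it assumes Mahalanobis-type distances, uses a plain chi-squared quantile without any correction factor, and the data, the index vector basic and the name next_subset are all made up for illustration. How alpha enters the actual cutoff inside mvBACON may differ.

```r
set.seed(1)
p <- 2
clean <- matrix(rnorm(50 * p), ncol = p)   # 50 well-behaved points
far   <- matrix(10, nrow = 3, ncol = p)    # 3 planted outliers at (10, 10)
x     <- rbind(clean, far)

basic  <- 1:50                             # hypothetical current basic subset
center <- colMeans(x[basic, , drop = FALSE])
S      <- cov(x[basic, , drop = FALSE])

d2     <- mahalanobis(x, center, S)        # squared Mahalanobis distances of all rows
alpha  <- 0.95
cutoff <- qchisq(alpha, df = p)            # plain chi-squared quantile (an assumption)

next_subset <- d2 < cutoff                 # rows that would enter the next subset
which(!next_subset)                        # rows flagged as potential outliers
```

Because center and S are computed from the subset only, the planted points cannot inflate the covariance and mask themselves, which is the motivation for iterating from a basic subset rather than using all rows at once.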
Ueli Oetliker, Swiss Federal Statistical Office, for S-plus 5.1. Port to R, testing, etc. by Martin Maechler.
Billor, N., Hadi, A. S., and Velleman, P. F. (2000). BACON: Blocked Adaptive Computationally-Efficient Outlier Nominators; Computational Statistics and Data Analysis 34, 279–298. doi: 10.1016/S0167-9473(99)00101-2
require(robustbase) # for example data and covMcd():
## simple 2D example :
plot(starsCYG, main = "starsCYG data (n=47)")
B.st <- mvBACON(starsCYG)
points(starsCYG[ ! B.st$subset,], pch = 4, col = 2, cex = 1.5)
stopifnot(identical(which(!B.st$subset), c(7L,9L,11L,14L,20L,30L,34L)))
## finds the clear outliers (and 3 "borderline")
## 'coleman' from pkg 'robustbase'
coleman.x <- data.matrix(coleman[, 1:6])
Cc <- covMcd (coleman.x) # truly robust
summary(Cc) # -> 6 outliers (1,3,10,12,17,18)
Cb1 <- mvBACON(coleman.x) ##-> subset is all TRUE hmm??
Cb2 <- mvBACON(coleman.x, init.sel = "dUniMedian")
stopifnot(all.equal(Cb1, Cb2))
Cb.r <- lapply(1:20, function(i) { set.seed(i)
mvBACON(coleman.x, init.sel="random", verbose=FALSE) })
nm <- names(Cb.r[[1]]); nm <- nm[nm != "steps"]
all(eqC <- sapply(Cb.r[-1], function(CC) all.equal(CC[nm], Cb.r[[1]][nm]))) # TRUE
## --> BACON always breaks down, i.e., does not see the outliers here
## breaks down even when manually starting with all the non-outliers:
Cb.man <- mvBACON(coleman.x, init.sel = "manual",
man.sel = setdiff(1:20, c(1,3,10,12,17,18)))
which( ! Cb.man$subset) # the outliers according to mvBACON : _none_