mlr3: benchmark – R documentation


benchmark

Benchmark Multiple Learners on Multiple Tasks

Description

Runs a benchmark on arbitrary combinations of tasks (Task), learners (Learner), and resampling strategies (Resampling), possibly in parallel.

Usage

benchmark(design, store_models = FALSE, store_backends = TRUE)

Arguments

`design`	(`data.frame()`) Data frame (or `data.table::data.table()`) with three columns: "task", "learner", and "resampling". Each row defines one resampling experiment by providing a Task, a Learner, and an instantiated Resampling strategy. The helper function `benchmark_grid()` can generate an exhaustive design (see Examples) and instantiates the Resamplings per Task.
`store_models`	(`logical(1)`) Store the fitted model in the resulting BenchmarkResult? Set to `TRUE` if you want to further analyse the models or want to extract information like variable importance.
`store_backends`	(`logical(1)`) Keep the DataBackend of the Task in the BenchmarkResult? Set to `TRUE` if your performance measures require a Task, or to analyse results more conveniently. Set to `FALSE` to reduce the file size and memory footprint after serialization. The current default is `TRUE`, but this will change in a future release.
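
As a minimal sketch of the `store_models` argument, the following keeps the fitted models and then inspects one of them. It assumes the mlr3 and rpart packages are installed; the choice of task, learner, and the variable names `fitted_learner` and `importance` are only for illustration.

```r
library(mlr3)

# Small design: one task, one learner, 3-fold CV (instantiated by benchmark_grid())
design = benchmark_grid(
  list(tsk("sonar")),
  list(lrn("classif.rpart")),
  rsmp("cv", folds = 3)
)

# Keep the fitted models in the BenchmarkResult
bmr = benchmark(design, store_models = TRUE)

# Take the first ResampleResult and the learner trained in its first iteration
rr = bmr$resample_result(1)
fitted_learner = rr$learners[[1]]

# The underlying model object and its variable importance are now available
print(class(fitted_learner$model))
importance = fitted_learner$importance()
print(head(importance))
```

With the default `store_models = FALSE`, the model slot of these learners would not be retained, so this kind of inspection is only possible when the models are stored.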

Value

BenchmarkResult.

Parallelization

This function can be parallelized with the future package. One job is one resampling iteration, and all jobs are sent to an apply function from future.apply in a single batch. To select a parallel backend, use future::plan().
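
A minimal sketch of a parallel run, assuming the future and future.apply packages are installed. The "multisession" backend and the worker count of 2 are arbitrary choices for illustration, not recommendations.

```r
library(mlr3)

# 2 tasks x 2 learners, each resampled with 3-fold CV
design = benchmark_grid(
  list(tsk("penguins"), tsk("sonar")),
  list(lrn("classif.featureless"), lrn("classif.rpart")),
  rsmp("cv", folds = 3)
)

# Select a parallel backend (assumed here: 2 background R sessions)
future::plan("multisession", workers = 2)

# 2 x 2 x 3 = 12 resampling iterations, i.e. 12 jobs sent in one batch
bmr = benchmark(design)

# Switch back to sequential execution afterwards
future::plan("sequential")

print(bmr$aggregate())
```

Because one job is a single resampling iteration, designs with few iterations or very fast learners may see little benefit from parallelization.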

Progress Bars

This function supports progress bars via the progressr package. Wrap the call in progressr::with_progress() to enable them. We recommend using the progress package as the backend; enable it with progressr::handlers("progress").
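
A minimal sketch of enabling a progress bar, assuming the progressr and progress packages are both installed. The design is only an illustration.

```r
library(mlr3)

design = benchmark_grid(
  list(tsk("sonar")),
  list(lrn("classif.featureless"), lrn("classif.rpart")),
  rsmp("cv", folds = 3)
)

# Use the 'progress' package to render the progress bar
progressr::handlers("progress")

# Progress is only reported for code evaluated inside with_progress()
progressr::with_progress({
  bmr = benchmark(design)
})

print(bmr)
```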

Logging

mlr3 uses the lgr package for logging. lgr supports multiple log levels, which can be queried with getOption("lgr.log_levels").

To suppress output and reduce verbosity, lower the log level from the default "info" to "warn":

lgr::get_logger("mlr3")$set_threshold("warn")

To get additional log output for debugging, increase the log level to "debug" or "trace":

lgr::get_logger("mlr3")$set_threshold("debug")

To log to a file or a database, see the documentation of lgr::lgr-package.
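
As a minimal sketch of file logging, an lgr file appender can be attached to the mlr3 logger. This assumes the lgr package's `AppenderFile` class and the logger's `add_appender()` / `remove_appender()` methods; the temporary file path and the appender name "file" are arbitrary choices for illustration.

```r
library(mlr3)

logger = lgr::get_logger("mlr3")

# Attach a file appender; path and appender name are illustrative
log_file = tempfile(fileext = ".log")
logger$add_appender(lgr::AppenderFile$new(log_file), name = "file")

design = benchmark_grid(
  list(tsk("sonar")),
  list(lrn("classif.featureless")),
  rsmp("holdout")
)
bmr = benchmark(design)

# Inspect what was written, then detach the appender again
if (file.exists(log_file)) print(readLines(log_file))
logger$remove_appender("file")
```

The appender receives only messages at or above the logger's current threshold, so lowering the threshold to "warn" also reduces what ends up in the file.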

Note

The fitted models are discarded after the predictions have been scored in order to reduce memory consumption. If you need access to the models for later analysis, set store_models to TRUE.

Examples

library(mlr3)

# Benchmarking with benchmark_grid():
tasks = lapply(c("penguins", "sonar"), tsk)
learners = lapply(c("classif.featureless", "classif.rpart"), lrn)
resamplings = rsmp("cv", folds = 3)

design = benchmark_grid(tasks, learners, resamplings)
print(design)

set.seed(123)
bmr = benchmark(design)

## Data of all resamplings
head(as.data.table(bmr))

## Aggregated performance values
aggr = bmr$aggregate()
print(aggr)

## Extract predictions of first resampling result
rr = aggr$resample_result[[1]]
as.data.table(rr$prediction())

# Benchmarking with a custom design:
# - fit classif.featureless on penguins with a 3-fold CV
# - fit classif.rpart on sonar using a holdout
tasks = list(tsk("penguins"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))
resamplings = list(rsmp("cv", folds = 3), rsmp("holdout"))

design = data.table::data.table(
  task = tasks,
  learner = learners,
  resampling = resamplings
)

## Instantiate resamplings
design$resampling = Map(
  function(task, resampling) resampling$clone()$instantiate(task),
  task = design$task, resampling = design$resampling
)

## Run benchmark
bmr = benchmark(design)
print(bmr)

## Get the training set of the 2nd iteration of the featureless learner on penguins
rr = bmr$aggregate()[learner_id == "classif.featureless"]$resample_result[[1]]
rr$resampling$train_set(2)

mlr3

Machine Learning in R - Next Generation

v0.11.0

LGPL-3

Authors

Michel Lang [cre, aut] (<https://orcid.org/0000-0001-9754-0393>), Bernd Bischl [aut] (<https://orcid.org/0000-0001-6002-6980>), Jakob Richter [aut] (<https://orcid.org/0000-0003-4481-5554>), Patrick Schratz [aut] (<https://orcid.org/0000-0003-0748-6624>), Giuseppe Casalicchio [ctb] (<https://orcid.org/0000-0001-5324-5966>), Stefan Coors [ctb] (<https://orcid.org/0000-0002-7465-2146>), Quay Au [ctb] (<https://orcid.org/0000-0002-5252-8902>), Martin Binder [aut], Marc Becker [ctb] (<https://orcid.org/0000-0002-8115-0400>)
