Configuration for TabNet models
tabnet_config(
  batch_size = 256,
  penalty = 0.001,
  clip_value = NULL,
  loss = "auto",
  epochs = 5,
  drop_last = FALSE,
  decision_width = NULL,
  attention_width = NULL,
  num_steps = 3,
  feature_reusage = 1.3,
  virtual_batch_size = 128,
  valid_split = 0,
  learn_rate = 0.02,
  optimizer = "adam",
  lr_scheduler = NULL,
  lr_decay = 0.1,
  step_size = 30,
  checkpoint_epochs = 10,
  cat_emb_dim = 1,
  num_independent = 2,
  num_shared = 2,
  momentum = 0.02,
  verbose = FALSE,
  device = "auto"
)
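As a minimal sketch of typical use, a configuration is built once and handed to model fitting; this assumes tabnet_fit() accepts the resulting list through a config argument, and the toy data frame below is purely illustrative.

library(tabnet)

# Toy regression data (illustrative only).
x <- data.frame(a = runif(200), b = runif(200))
y <- x$a + 2 * x$b + rnorm(200, sd = 0.1)

# Override a handful of defaults; all other values stay as shown above.
cfg <- tabnet_config(
  batch_size  = 64,
  epochs      = 10,
  valid_split = 0.2,   # hold out 20% of rows for validation
  verbose     = TRUE
)

# Assumption: the config list is passed to the fitting function via `config`.
fit <- tabnet_fit(x, y, config = cfg)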
batch_size
(int) Number of examples per batch; large batch sizes are recommended. (default: 256)
penalty
This is the extra sparsity loss coefficient as proposed in the original paper. The bigger this coefficient is, the sparser your model will be in terms of feature selection. Depending on the difficulty of your problem, reducing this value could help.
clip_value
If a float is given, gradients will be clipped at clip_value during training. Pass NULL to disable gradient clipping. (default: NULL)
loss
(character or function) Loss function for training. Defaults to mse for regression and cross-entropy for classification.
epochs
(int) Number of training epochs.
drop_last
(bool) Whether to drop the last batch when it is incomplete during training.
decision_width
(int) Width of the decision prediction layer. Larger values give the model more capacity, at the risk of overfitting. Values typically range from 8 to 64.
attention_width
(int) Width of the attention embedding for each mask. According to the paper, n_d = n_a is usually a good choice. (default: 8)
num_steps
(int) Number of steps in the architecture (usually between 3 and 10).
feature_reusage
(float) This is the coefficient for feature reusage in the masks. A value close to 1 will make mask selection least correlated between layers. Values range from 1.0 to 2.0.
virtual_batch_size
(int) Size of the mini-batches used for "Ghost Batch Normalization". (default: 128)
valid_split
(float) The fraction of the dataset used for validation.
learn_rate
Initial learning rate for the optimizer.
optimizer
The optimization method. Currently only the "adam" string is supported; you can also pass any torch optimizer function.
lr_scheduler
If NULL, no learning rate decay is used. If "step", the learning rate is decayed by lr_decay every step_size epochs. A torch::lr_scheduler function that takes the optimizer as its only argument can also be passed (see the example at the end of this page).
lr_decay
Multiplies the learning rate by lr_decay every step_size epochs. Unused if lr_scheduler is NULL or a torch::lr_scheduler function.
step_size
The learning rate scheduler step size. Unused if lr_scheduler is NULL or a torch::lr_scheduler function.
checkpoint_epochs
Checkpoint model weights and architecture every checkpoint_epochs epochs (default: 10). Set to 0 to disable checkpointing.
cat_emb_dim
Embedding size for categorical features. (default: 1)
num_independent
Number of independent Gated Linear Unit (GLU) layers at each step. Usual values range from 1 to 5.
num_shared
Number of shared Gated Linear Unit (GLU) layers at each step. Usual values range from 1 to 5.
momentum
Momentum for batch normalization; typically ranges from 0.01 to 0.4. (default: 0.02)
verbose
(bool) Whether to print progress and loss values during training.
device
The device to use for training: "cpu" or "cuda". The default ("auto") uses "cuda" if it is available and "cpu" otherwise.
A named list with all hyperparameters of the TabNet implementation.
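The sketch below illustrates how the scheduler-related arguments interact, assuming the "step" scheduler behaviour described above, and shows a torch optimizer function passed in place of the "adam" string (as the optimizer entry allows).

library(tabnet)
library(torch)

cfg <- tabnet_config(
  learn_rate   = 0.02,
  lr_scheduler = "step",        # built-in step decay
  lr_decay     = 0.5,           # multiply the learning rate by 0.5 ...
  step_size    = 10,            # ... every 10 epochs
  optimizer    = optim_adam,    # a torch optimizer function also works
  device       = "auto"         # "cuda" when available, otherwise "cpu"
)

Under the step-decay semantics described above, this configuration would train with a learning rate of 0.02 for the first 10 epochs, 0.01 for the next 10, 0.005 after that, and so on.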