SSLR: SSLRRandomForest – R documentation


SSLRRandomForest

General Interface Random Forest model

Description

Random Forest is a simple and effective semi-supervised learning method. It follows the traditional Random Forest algorithm, except that it is built from semi-supervised decision trees. It can be used for classification or regression: if Y is numeric the model performs regression; otherwise it performs classification.
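To give a sense of what a semi-supervised decision tree can look like, the sketch below follows one reading of the paper listed under References: a node's impurity blends a label term (computed on labeled rows only) with a feature term (computed on all rows), and the weight w controls the blend. The helpers gini() and ssl_impurity() are hypothetical names written for this illustration; they are not SSLR functions, and the package's internal criterion may differ in its details.

```r
# Gini impurity of a label vector; unlabeled rows (NA) are ignored.
gini <- function(y) {
  y <- y[!is.na(y)]
  if (length(y) == 0) return(0)
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Hypothetical helper: semi-supervised impurity of a node.
# x, y         rows that fall into the node (features, labels with NA)
# x_ref, y_ref the full training set, used to normalise both terms
# w = 1 uses only the labels; w = 0 uses only the feature variances.
ssl_impurity <- function(x, y, w, x_ref = x, y_ref = y) {
  label_term <- gini(y) / gini(y_ref)
  feature_term <- mean(sapply(names(x), function(j) {
    var(x[[j]]) / var(x_ref[[j]])
  }))
  w * label_term + (1 - w) * feature_term
}

# Toy data: six rows, two of them unlabeled.
x <- data.frame(a = c(1, 2, 3, 10, 11, 12), b = c(5, 5, 6, 6, 5, 6))
y <- factor(c("A", NA, "A", "B", NA, "B"))

# At the root both terms are normalised to 1, so the result is 1 for any w.
root <- ssl_impurity(x, y, w = 0.3)

# Candidate child holding rows 1 to 3, scored against the full data.
left_supervised <- ssl_impurity(x[1:3, ], y[1:3], w = 1, x_ref = x, y_ref = y)
left_blended    <- ssl_impurity(x[1:3, ], y[1:3], w = 0.3, x_ref = x, y_ref = y)
```

With w = 1 the child looks perfectly pure because its two known labels agree; with a smaller w the spread of the features also counts, so unlabeled rows influence which split is preferred.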

Usage

SSLRRandomForest(
  mtry = NULL,
  trees = 500,
  min_n = NULL,
  w = 0.5,
  replace = TRUE,
  tree_max_depth = Inf,
  sampsize = NULL,
  min_samples_leaf = NULL,
  allowParallel = TRUE
)

Arguments

`mtry`	number of features used in each decision tree. Default is NULL, which means mtry = log(n_features) + 1
`trees`	number of trees. Default is 500
`min_n`	minimum number of samples in each tree. Default is NULL, which means all training data are used
`w`	weight parameter ranging from 0 to 1. Default is 0.5
`replace`	whether to sample with replacement. Default is TRUE
`tree_max_depth`	maximum tree depth. Default is Inf
`sampsize`	size of the sample to draw. Default is nrow(x) if replace is TRUE, otherwise ceiling(.632*nrow(x))
`min_samples_leaf`	minimum number of samples in any terminal (leaf) node. Effective default is 1 (passed as NULL in the signature)
`allowParallel`	run Random Forest in parallel if doParallel is loaded. Default is TRUE
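The documented defaults for mtry and sampsize can be restated as small functions. This is only a sketch of the formulas given in the argument descriptions: default_mtry() and default_sampsize() are hypothetical helpers, not part of SSLR, and any rounding the package applies to mtry is not stated here.

```r
# Hypothetical helpers that restate the documented defaults.
default_mtry <- function(n_features) {
  log(n_features) + 1  # natural logarithm, as in the argument description
}

default_sampsize <- function(n_rows, replace = TRUE) {
  if (replace) n_rows else ceiling(0.632 * n_rows)
}

default_mtry(13)                        # a data set with 13 features
default_sampsize(100, replace = TRUE)   # 100: as many rows as the data has
default_sampsize(100, replace = FALSE)  # 64: ceiling(63.2)
```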

Details

Parallel processing is available through the doParallel package together with allowParallel = TRUE.
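A minimal sketch of a parallel fit follows. It assumes that registering a doParallel backend before calling fit() is what lets the trees be built in parallel; the page only states that doParallel must be loaded, so the explicit cluster setup is an assumption. The worker count (2), the number of trees (20) and the 80% unlabeled share are arbitrary choices for illustration.

```r
library(doParallel)  # also attaches foreach and parallel
library(SSLR)
library(tidymodels)  # provides fit() and %>%

# Start two worker processes and register them as the foreach backend.
cl <- makeCluster(2)
registerDoParallel(cl)

data(wine)
set.seed(1)

# Hide 80% of the labels so that the problem is semi-supervised.
unlabeled <- sample(nrow(wine), size = round(0.8 * nrow(wine)))
train <- wine
train$Wine[unlabeled] <- NA

m <- SSLRRandomForest(trees = 20, allowParallel = TRUE) %>%
  fit(Wine ~ ., data = train)

# Release the workers when done.
stopCluster(cl)
```

Setting allowParallel = FALSE in the same call is a simple way to compare against a sequential run.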

References

Jurica Levatić, Michelangelo Ceci, Dragi Kocev, Sašo Džeroski.
Semi-supervised classification trees.
Published online: 25 March 2017. © Springer Science+Business Media New York 2017

Examples

library(tidyverse)
library(caret)
library(SSLR)
library(tidymodels)

data(wine)

# Split into 70% training and 30% test rows, stratified by class
set.seed(1)
train.index <- createDataPartition(wine$Wine, p = .7, list = FALSE)
train <- wine[ train.index,]
test  <- wine[-train.index,]

# Column index of the class variable
cls <- which(colnames(wine) == "Wine")

# Keep labels for 20% of the training rows; mark the rest as unlabeled (NA)
labeled.index <- createDataPartition(train$Wine, p = .2, list = FALSE)
train[-labeled.index,cls] <- NA

# Fit a small forest (5 trees) with weight w = 0.3
m <- SSLRRandomForest(trees = 5, w = 0.3) %>% fit(Wine ~ ., data = train)

# Classification metrics on the test set
predict(m,test) %>%
  bind_cols(test) %>%
  metrics(truth = "Wine", estimate = .pred_class)

# Class probabilities
predict(m,test, type = "prob")

SSLR

Semi-Supervised Classification, Regression and Clustering Methods

v0.9.3.1

GPL-3

Authors

Francisco Jesús Palomares Alabarce [aut, cre] (<https://orcid.org/0000-0002-0499-7034>), José Manuel Benítez [ctb] (<https://orcid.org/0000-0002-2346-0793>), Isaac Triguero [ctb] (<https://orcid.org/0000-0002-0150-0651>), Christoph Bergmeir [ctb] (<https://orcid.org/0000-0002-3665-9021>), Mabel González [ctb] (<https://orcid.org/0000-0003-0152-444X>)
