Split modeling data into train and test sets
Takes data, a fraction (the share of observations for the train set) and an optional seed, and returns the train and test sets
splitdata(data, fraction, seed = NULL)
data | a matrix, data.frame or data.table
fraction | proportion of observations that should go in the train set
seed | an optional integer value that makes the split reproducible (default NULL)
An essential task before modeling is to split the modeling data into
train and test sets. splitdata is built for this task and returns a list
with the train and test sets, which can be extracted using the code given in the example.
fraction is the proportion of observations assigned to the train dataset; the remaining
observations go to the test dataset. To generate the same train and test
datasets every time, specify a seed value.
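The behaviour described above can be sketched in a few lines of base R. This is only an illustration, not the package's actual source: the name split_sketch is hypothetical, the input is assumed to be a data.frame or matrix, and rounding the train size with round() is an assumption about how a fractional row count is handled.

```r
# Hypothetical stand-in for splitdata(); not the package's actual source.
split_sketch <- function(data, fraction, seed = NULL) {
  # Fix the random number generator only when a seed is supplied
  if (!is.null(seed)) set.seed(seed)

  n <- nrow(data)
  # Number of rows for the train set (rounding is an assumption)
  n_train <- round(fraction * n)

  # Draw the train row indices at random, without replacement
  train_idx <- sample.int(n, size = n_train)
  # All remaining rows go to the test set
  test_idx <- setdiff(seq_len(n), train_idx)

  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[test_idx, , drop = FALSE]
  )
}
```

Calling the sketch twice with the same seed should give identical train and test sets, while leaving seed as NULL draws a fresh random split on each call.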
A list with two elements, train and test, holding the train and test sets
Akash Jain
# A 'data.frame'
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'),
                 z = c(1, 1, 0, 0, 1, 0, 0, 1, 1, 0))

# Split data into train (70%) and test (30%)
ltData <- splitdata(data = df, fraction = 0.7, seed = 123)
trainData <- ltData$train
testData <- ltData$test