StatMeasures: splitdata – R documentation

StatMeasures

splitdata

Split modeling data into train and test sets

Description

Takes a dataset, the fraction of observations to assign to the train set, and an optional seed, and returns the train and test sets.

Usage

splitdata(data, fraction, seed = NULL)

Arguments

`data`	a matrix, data.frame or data.table
`fraction`	proportion of observations that should go in the train set
`seed`	an integer value used to make the split reproducible; defaults to NULL

Details

An essential task before modeling is to split the modeling data into train and test sets. splitdata is built for this task and returns a list containing the train and test sets, which can be extracted as shown in the example.

fraction is the proportion of observations assigned to the train set; the remaining observations go to the test set. To generate the same train and test sets every time, specify a seed value.
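
The behaviour described above can be illustrated with a minimal base-R sketch. This is not the source of splitdata: the function name split_sketch is hypothetical, and the sketch assumes that the split is a simple random sample of rows, that the train size is rounded down, and that data is a data.frame or matrix. The package may differ on any of these points.

```r
# Hypothetical helper, not part of StatMeasures: a base-R sketch of a seeded split.
split_sketch <- function(data, fraction, seed = NULL) {
  # Seed the random number generator only when the caller supplies a seed
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(data)
  # Assumption: the train size is rounded down; splitdata may round differently
  n_train <- floor(fraction * n)
  # Draw row indices for the train set without replacement
  train_idx <- sample(seq_len(n), size = n_train)
  # Every row that was not drawn goes to the test set
  test_idx <- setdiff(seq_len(n), train_idx)
  list(train = data[train_idx, , drop = FALSE],
       test  = data[test_idx, , drop = FALSE])
}

df <- data.frame(x = 1:10, z = c(1, 1, 0, 0, 1, 0, 0, 1, 1, 0))
first  <- split_sketch(df, fraction = 0.7, seed = 123)
second <- split_sketch(df, fraction = 0.7, seed = 123)
identical(first$train, second$train)  # checks that the same seed gives the same split
```

Using setdiff for the test indices, rather than negative indexing, keeps the sketch correct even when the train set is empty.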

Value

A list with two elements, train and test, containing the train set and the test set respectively.

Author(s)

Akash Jain

Examples

# A 'data.frame'
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'),
                 z = c(1, 1, 0, 0, 1, 0, 0, 1, 1, 0))

# Split data into train (70%) and test (30%)
ltData <- splitdata(data = df, fraction = 0.7, seed = 123)
trainData <- ltData$train
testData <- ltData$test

StatMeasures

Easy Data Manipulation, Data Quality and Statistical Checks

v1.0

GPL-2

Authors

Akash Jain

Initial release

2015-03-24
