dqcontinuous

Data quality check of continuous variables


Description

Takes a dataset and returns a summary of its continuous variables

Usage

dqcontinuous(data)

Arguments

data

a data.frame or data.table

Details

It is important to know the distribution of the continuous variables in a dataset. dqcontinuous produces an output that reports, for each continuous variable: the variable name, number of non-missing values, number of missing values, percentage missing, minimum, average, maximum, standard deviation, variance, common percentiles from 1 to 99, and number of outliers.
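To make these measures concrete, here is a minimal base R sketch that computes most of them for a single vector. It does not call dqcontinuous; the column names and the choice of percentiles are assumptions made for illustration and may not match the actual output of the function. In particular, whether the package reports the percentage missing on a 0 to 100 scale is not stated above.

```r
# Illustrative only: base R equivalents of the measures listed above.
# Column names are hypothetical, not those returned by dqcontinuous.
y <- c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37)

n_missing <- sum(is.na(y))

summary_y <- data.frame(
  variable   = "y",
  nonMissing = sum(!is.na(y)),
  missing    = n_missing,
  pctMissing = 100 * n_missing / length(y),  # assumed 0-100 scale
  minimum    = min(y, na.rm = TRUE),
  mean       = mean(y, na.rm = TRUE),
  maximum    = max(y, na.rm = TRUE),
  sd         = sd(y, na.rm = TRUE),
  variance   = var(y, na.rm = TRUE)
)

# A few percentiles; the exact set reported by the package is not listed here
pctl <- quantile(y, probs = c(0.01, 0.25, 0.50, 0.75, 0.99), na.rm = TRUE)

print(summary_y)
print(pctl)
```

All statistics are computed on the non-missing values only, which is why each call passes na.rm = TRUE.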

The function treats all integer and numeric variables as continuous and produces output for them. If some integer or numeric variables in the data do not represent continuous quantities (for example, identifiers or codes), change their type to an appropriate class before calling the function.
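As a sketch of that type change, the example below converts two numeric-looking columns to non-numeric classes and then checks which columns would still count as integer or numeric. The data and column names are made up for illustration, and the is.numeric check only mirrors the rule described above; it is not taken from the package source.

```r
# Hypothetical data: 'id' and 'zip' are stored as numbers but are not continuous
df <- data.frame(id     = c(101L, 102L, 103L),
                 zip    = c(10001, 94105, 60601),
                 income = c(52000, 61000, 48500))

# Recast the non-continuous columns to more appropriate classes
df$id  <- as.factor(df$id)
df$zip <- as.character(df$zip)

# is.numeric() is TRUE for integer and double columns, FALSE for factor/character
is_continuous <- sapply(df, is.numeric)
print(names(df)[is_continuous])
```

After the conversion only income remains numeric, so only that column would be summarised as continuous under the rule described above.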

dqcontinuous uses the same criterion to identify outliers as the one used for box plots. All values greater than the 75th percentile + 1.5 times the interquartile range, or less than the 25th percentile - 1.5 times the interquartile range, are tagged as outliers.
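The rule can be sketched in base R as follows. The helper count_outliers is hypothetical and not part of StatMeasures, and it uses R's default quantile method, which may differ from the one the package uses internally.

```r
# Hypothetical helper illustrating the 1.5 * IQR rule; not the package's code.
count_outliers <- function(x) {
  x <- x[!is.na(x)]                                   # ignore missing values
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]                                  # interquartile range
  lower <- q[1] - 1.5 * iqr
  upper <- q[2] + 1.5 * iqr
  sum(x < lower | x > upper)                          # values outside the fences
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37)

n_out_x <- count_outliers(x)
n_out_y <- count_outliers(y)
print(c(x = n_out_x, y = n_out_y))
```

For y, the quartiles of the nine non-missing values are 34 and 78, so the fences sit at -32 and 144 and only the value 590 falls outside them. No value of x lies outside its fences.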

This function works for both 'data.frame' and 'data.table' inputs but always returns a 'data.frame'.

Value

a data.frame containing, for every integer and numeric variable, the number of non-missing values, number of missing values, percentage of missing values, minimum, mean, maximum, standard deviation, variance, percentiles, and count of outliers

Author(s)

Akash Jain

Examples

# A 'data.frame'
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37))

# Generate a data quality report of continuous variables
summaryContinuous <- dqcontinuous(data = df)

StatMeasures

Easy Data Manipulation, Data Quality and Statistical Checks

v1.0
GPL-2
Authors
Akash Jain
Initial release
2015-03-24
