You find a package on GitHub named datacleaner, which is supposed to clean data sets and handle NA values accordingly: https://github.com/Quantargo/datacleaner. Additionally, the package is supposed to deal with outliers. Currently, the package implements the functions windsorize() and meanimpute() to do data winsorization and mean imputation on data sets.
You decide to use the package from GitHub, but unfortunately some functionality seems to be missing and some checks seem to be failing. Specifically, you find the following critical points:
- windsorize() does not seem to be working correctly.
- windsorize() should give an error with a clear message if either an empty vector or a vector containing only NAs is passed. Hint: Use the function stop() to throw errors if conditions are met.
- A function transform_log() to log-transform values is missing. The function should give an error, or ideally hints at workarounds, if numerical errors arise.
- R CMD check (or devtools::check(), the Check button in the Build tab in RStudio) fails with errors, warnings and notes.
- ?windsorize. Additionally, function examples are missing, e.g., use a vector created using rnorm() to explain how windsorize() is working or
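
The points on input validation, log transformation and documented examples could be approached roughly as in the following sketch. It is an illustration under assumptions, not the package's actual code: the argument p, its default value, the two-sided clamping and the wording of the error messages are all hypothetical choices made here.

```r
#' Winsorize a numeric vector (illustrative sketch, not the package's code)
#'
#' @param x Numeric vector.
#' @param p Upper quantile; values outside the (1 - p) and p quantiles are
#'   clamped to those quantiles. Name and default are assumptions.
#' @return Numeric vector of the same length as x.
#' @examples
#' x <- rnorm(100)
#' range(x)
#' range(windsorize(x, p = 0.9))
#' @export
windsorize <- function(x, p = 0.95) {
  # Fail early with clear messages, as the exercise hint suggests via stop()
  if (length(x) == 0) {
    stop("`x` must not be empty.")
  }
  if (all(is.na(x))) {
    stop("`x` contains only NA values.")
  }
  if (!is.numeric(x)) {
    stop("`x` must be numeric.")
  }
  # Lower and upper limits, ignoring NAs when computing the quantiles
  limits <- stats::quantile(x, probs = c(1 - p, p), na.rm = TRUE, names = FALSE)
  # Clamp values to the limits; NA entries stay NA
  pmin(pmax(x, limits[1]), limits[2])
}

#' Log-transform a numeric vector (illustrative sketch)
#'
#' @param x Numeric vector of strictly positive values.
#' @return Natural logarithm of x.
#' @examples
#' transform_log(c(1, 10, 100))
#' @export
transform_log <- function(x) {
  if (!is.numeric(x)) {
    stop("`x` must be numeric.")
  }
  if (any(x < 0, na.rm = TRUE)) {
    stop("`x` contains negative values, for which log() is undefined. ",
         "Consider shifting the data first, e.g. x - min(x, na.rm = TRUE) + 1.")
  }
  if (any(x == 0, na.rm = TRUE)) {
    stop("`x` contains zeros, for which log() returns -Inf. ",
         "Consider log1p(x) as a workaround.")
  }
  log(x)
}
```

The roxygen2 comments above each function are one way to provide help pages and examples; after running devtools::document(), they would be rendered into the package's man pages. The validation checks run in this order because an empty vector also satisfies all(is.na(x)), so testing for emptiness first keeps the two error messages distinct.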
Requirements: This exercise requires a GitHub account. If you do not have one yet, please create one as described in the Git chapter.