Remove observations with duplicate keys from data
Takes a data set and one or more keys, and returns the data with observations duplicated by key removed
rmdupkey(data, by)
data | a data.frame or data.table
by | a character vector of keys to be used
This function removes duplicate observations by key(s). Unlike other
functions that remove duplicates, rmdupkey works for both 'data.frame'
and 'data.table', and it also returns the duplicated observations.
We often want to go back to the duplicated observations and see why the duplication occurred. The duplicated observations can be picked out using the code given in the example.
a two-element list: unique data (unqData) and duplicate data (dupData)
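To make the shape of the returned list concrete, here is a minimal base R sketch of the idea. It is not the package's implementation: the name rmdupkey_sketch is hypothetical, it handles only a 'data.frame', and it assumes that the first occurrence of each key is kept while later occurrences are treated as the duplicates. The real rmdupkey may split the rows differently.

```r
# Hypothetical sketch, not the package's code; 'data.frame' input only.
# Assumption: the first row seen for each key is kept as unique, and
# every later row with the same key is returned as a duplicate.
rmdupkey_sketch <- function(data, by) {
  # TRUE for rows whose key combination has already appeared above
  isDup <- duplicated(data[, by, drop = FALSE])
  list(
    unqData = data[!isDup, , drop = FALSE],  # first occurrence of each key
    dupData = data[isDup, , drop = FALSE]    # repeated occurrences
  )
}

df <- data.frame(x = c(1, 2, 1, 1), y = c(3, 3, 1, 3))
res <- rmdupkey_sketch(data = df, by = c('x'))
```

Under these assumptions, res$unqData holds rows 1 and 2 of df and res$dupData holds rows 3 and 4, so the duplicates stay available for inspection instead of being discarded.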
Akash Jain
# A 'data.frame'
df <- data.frame(x = c(1, 2, 1, 1), y = c(3, 3, 1, 3))

# Remove duplicate observations by key from data
ltDf <- rmdupkey(data = df, by = c('x'))
unqDf <- ltDf$unqData
dupDf <- ltDf$dupData