Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

get_dupes

Get rows of a data.frame with identical values for the specified variables.


Description

For hunting duplicate records during data cleaning. Specify the data.frame and the variable combination to search for duplicates and get back the duplicated rows.

Usage

get_dupes(dat, ...)

Arguments

dat

The input data.frame.

...

Unquoted variable names to search for duplicates. This takes a tidyselect specification.

Value

Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.

Examples

get_dupes(mtcars, mpg, hp)

# or called with the magrittr pipe %>% :
mtcars %>% get_dupes(wt)

# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))

janitor

Simple Tools for Examining and Cleaning Dirty Data

v2.1.0
MIT + file LICENSE
Authors
Sam Firke [aut, cre], Bill Denney [ctb], Chris Haid [ctb], Ryan Knight [ctb], Malte Grosser [ctb], Jonathan Zadra [ctb]
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.