Drop, replace and fill missing values in data.frame
A set of tools to deal with missing values in data.frames. They can drop, replace or fill (with the next or previous observation) missing values, or delete rows and columns according to how many missing values they contain.
drop_na_dt(.data, ...)
replace_na_dt(.data, ..., to)
delete_na_cols(.data, prop = NULL, n = NULL)
delete_na_rows(.data, prop = NULL, n = NULL)
fill_na_dt(.data, ..., direction = "down")
shift_fill(x, direction = "down")
.data | A data.frame.
... | Columns to be replaced or filled. If not specified, all columns are used.
to | The value that NAs should be replaced with.
prop | Columns (or rows) whose proportion of NAs is larger than or equal to "prop" are deleted.
n | Columns (or rows) whose number of NAs is larger than or equal to "n" are deleted.
direction | Direction in which to fill missing values. Currently either "down" (the default) or "up".
x | A vector with missing values to be filled.
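The "prop" and "n" thresholds can be illustrated with a minimal base R sketch. The helper na_cols_to_delete below is hypothetical and is not the package's implementation; it only reports which columns meet a threshold, and it does not cover what happens when neither argument is supplied.

```r
# Hypothetical helper illustrating the "prop" / "n" rule for columns.
# It is a sketch in base R, not the code behind delete_na_cols().
na_cols_to_delete <- function(df, prop = NULL, n = NULL) {
  na_count <- colSums(is.na(df))           # number of NAs in each column
  drop <- rep(FALSE, ncol(df))
  if (!is.null(prop)) drop <- drop | (na_count / nrow(df) >= prop)
  if (!is.null(n))    drop <- drop | (na_count >= n)
  names(df)[drop]                          # names of the columns that qualify
}

x <- data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5), z = rep(NA, 4))
na_cols_to_delete(x, prop = 0.5)   # columns with at least 50% NAs
na_cols_to_delete(x, n = 1)        # columns with at least one NA
```

The same rule applied to rows would use rowSums(is.na(df)) and ncol(df) in place of colSums and nrow.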
drop_na_dt
drops the rows that contain NAs in the specified columns.
fill_na_dt
fills NAs with the previous observation ("down") or the next observation ("up"),
also known as last observation carried forward (LOCF) and
next observation carried backward (NOCB), respectively.
delete_na_cols
drops the columns whose proportion of NAs is larger
than or equal to "prop", or whose number of NAs is larger than or equal to "n".
delete_na_rows
works in the same way but on rows.
shift_fill
fills the missing values in a vector.
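To make the LOCF and NOCB behaviour concrete, here is a minimal base R sketch of filling a vector in either direction. The helper locf_sketch is hypothetical and is not how shift_fill or fill_na_dt are implemented; it only illustrates the idea under the assumption that leading NAs (for "down") or trailing NAs (for "up") have nothing to be filled from and stay missing.

```r
# Hypothetical helper: fill NAs in a vector by carrying observations
# forward ("down", LOCF) or backward ("up", NOCB). A sketch only.
locf_sketch <- function(x, direction = "down") {
  if (direction == "up") {
    # NOCB is LOCF applied to the reversed vector
    return(rev(locf_sketch(rev(x), "down")))
  }
  idx <- cumsum(!is.na(x))        # how many non-missing values seen so far
  c(NA, x[!is.na(x)])[idx + 1]    # idx 0 (leading NAs) maps to NA
}

y <- c("a", NA, "b", NA, "c")
locf_sketch(y)                      # "a" "a" "b" "b" "c"
locf_sketch(y, "up")                # "a" "b" "b" "c" "c"
locf_sketch(c(NA, 1, NA, 2))        # NA 1 1 2
```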
data.table
https://stackoverflow.com/questions/23597140/how-to-find-the-percentage-of-nas-in-a-data-frame
https://stackoverflow.com/questions/2643939/remove-columns-from-dataframe-where-all-values-are-na
https://stackoverflow.com/questions/7235657/fastest-way-to-replace-nas-in-a-large-data-table
df <- data.table(x = c(1, 2, NA), y = c("a", NA, "b"))
df %>% drop_na_dt()
df %>% drop_na_dt(x)
df %>% drop_na_dt(y)
df %>% drop_na_dt(x, y)

df %>% replace_na_dt(to = 0)
df %>% replace_na_dt(x, to = 0)
df %>% replace_na_dt(y, to = 0)
df %>% replace_na_dt(x, y, to = 0)

df %>% fill_na_dt(x)
df %>% fill_na_dt() # not specified, fill all columns
df %>% fill_na_dt(y, direction = "up")

x = data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5), z = rep(NA, 4))
x
x %>% delete_na_cols()
x %>% delete_na_cols(prop = 0.75)
x %>% delete_na_cols(prop = 0.5)
x %>% delete_na_cols(prop = 0.24)
x %>% delete_na_cols(n = 2)

x %>% delete_na_rows(prop = 0.6)
x %>% delete_na_rows(n = 2)

# shift_fill
y = c("a", NA, "b", NA, "c")
shift_fill(y) # equals to shift_fill(y, "down")
shift_fill(y, "up")