dbplyr: pivot_wider.tbl_lazy – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

pivot_wider.tbl_lazy

Pivot data from long to wide

Description

pivot_wider() "widens" data, increasing the number of columns and decreasing the number of rows. The inverse transformation is pivot_longer(). Learn more in vignette("pivot", "tidyr").

Usage

pivot_wider.tbl_lazy(
  data,
  id_cols = NULL,
  names_from = name,
  names_prefix = "",
  names_sep = "_",
  names_glue = NULL,
  names_sort = FALSE,
  names_repair = "check_unique",
  values_from = value,
  values_fill = NULL,
  values_fn = max,
  ...
)

Arguments

`data`	A lazy data frame backed by a database query.
`id_cols`	A set of columns that uniquely identifies each observation.
`names_from, values_from`	A pair of arguments describing which column (or columns) to get the name of the output column (`names_from`), and which column (or columns) to get the cell values from (`values_from`). If `values_from` contains multiple values, the value will be added to the front of the output column.
`names_prefix`	String added to the start of every variable name.
`names_sep`	If `names_from` or `values_from` contains multiple variables, this will be used to join their values together into a single string to use as a column name.
`names_glue`	Instead of `names_sep` and `names_prefix`, you can supply a glue specification that uses the `names_from` columns (and special `.value`) to create custom column names.
`names_sort`	Should the column names be sorted? If `FALSE`, the default, column names are ordered by first appearance.
`names_repair`	What happens if the output has invalid column names?
`values_fill`	Optionally, a (scalar) value that specifies what each `value` should be filled in with when missing.
`values_fn`	A function, the default is `max()`, applied to the `value` in each cell in the output. In contrast to local data frames it must not be `NULL`.
`...`	Unused; included for compatibility with generic.

Details

The big difference to pivot_wider() for local data frames is that values_fn must not be NULL. By default it is max() which yields the same results as for local data frames if the combination of id_cols and value column uniquely identify an observation. Mind that you also do not get a warning if an observation is not uniquely identified.

The translation to SQL code basically works as follows:

Get unique keys in names_from column.

For each key value generate an expression of the form:

value_fn(
  CASE WHEN (`names from column` == `key value`)
  THEN (`value column`)
  END
) AS `output column`

Group data by id columns.
Summarise the grouped data with the expressions from step 2.

Examples

if (require("tidyr", quietly = TRUE)) {
memdb_frame(
  id = 1,
  key = c("x", "y"),
  value = 1:2
) %>%
  tidyr::pivot_wider(
    id_cols = id,
    names_from = key,
    values_from = value
  )
}

dbplyr

A 'dplyr' Back End for Databases

v2.1.1

MIT + file LICENSE

Authors

Hadley Wickham [aut, cre], Maximilian Girlich [aut], Edgar Ruiz [aut], RStudio [cph, fnd]

Initial release