bq_table_download

Download table data


Description

This function retrieves rows in chunks of page_size. It is best suited to the results of smaller queries (less than roughly 100 MB). For larger results, it is better to export them to a CSV file stored in Google Cloud Storage and download that file locally with a command line tool (see Larger datasets).
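To make the chunking concrete, the sketch below lists the row ranges that a paged download would need to request for a given number of rows, page_size, and start_index. Note that page_plan() is a hypothetical helper written only for illustration: it is not part of bigrquery and does not claim to mirror how the package schedules its requests.

```r
# Hypothetical helper (not part of bigrquery): list the zero-based start row
# and the row count of each page needed to fetch `n_rows` rows.
page_plan <- function(n_rows, page_size = 10000, start_index = 0) {
  stopifnot(n_rows >= 0, page_size > 0, start_index >= 0)
  if (n_rows == 0) {
    return(data.frame(start = numeric(0), rows = numeric(0)))
  }
  starts <- seq(start_index, start_index + n_rows - 1, by = page_size)
  # The last page may hold fewer rows than page_size
  rows <- pmin(page_size, start_index + n_rows - starts)
  data.frame(start = starts, rows = rows)
}

# 35000 rows at the default page size need three full pages and one partial page
plan <- page_plan(n_rows = 35000, page_size = 10000)
print(plan)
```

A smaller page_size produces more, smaller requests, which is the trade-off to consider when a 'responseTooLarge' error appears.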

Usage

bq_table_download(
  x,
  max_results = Inf,
  page_size = 10000,
  start_index = 0L,
  max_connections = 6L,
  quiet = NA,
  bigint = c("integer", "integer64", "numeric", "character")
)

Arguments

x

A bq_table

max_results

Maximum number of results to retrieve. Use Inf to retrieve all rows.

page_size

The number of rows returned per page. Make this smaller if you have many fields or large records and you are seeing a 'responseTooLarge' error.

start_index

Starting row index (zero-based).

max_connections

Maximum number of simultaneous connections to BigQuery servers.

quiet

If FALSE, displays a progress bar; if TRUE, is silent; if NA, displays a progress bar only for long-running jobs.

bigint

The R type that BigQuery's 64-bit integer types should be mapped to. The default is "integer", which returns R's integer type but yields NA for values greater than 2147483647 or less than -2147483647. "integer64" returns a bit64::integer64, which covers the full range of 64-bit integers.
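The range limits behind the bigint choice can be checked in plain R without touching BigQuery. The following is a minimal sketch; the remark about doubles is general R behaviour rather than something stated in this help page, and the bit64 part only runs if that package is installed.

```r
# R's integer type is 32-bit, so this is the largest value it can hold
int_max <- .Machine$integer.max

# One past the limit cannot be represented as an integer and becomes NA
too_big <- suppressWarnings(as.integer(int_max + 1))
is.na(too_big)

# Doubles ("numeric") hold integers exactly only up to 2^53,
# so neighbouring values beyond that point compare as equal
2^53 == 2^53 + 1

# bit64 (if installed) covers the full signed 64-bit range
if (requireNamespace("bit64", quietly = TRUE)) {
  big <- bit64::as.integer64("9223372036854775807")
  print(big)
}
```

If a column holds identifiers rather than quantities, "character" avoids any numeric rounding at the cost of losing arithmetic.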

Value

Because data retrieval may generate list-cols and the data frame print method can have problems with list-cols, this method returns tibbles. If you need a data frame, coerce the results with as.data.frame().

Complex data

bigrquery will retrieve nested and repeated columns into list-columns as follows:

  • Repeated values (arrays) will become list-cols of vectors.

  • Records will become list-cols of named lists.

  • Repeated records will become list-cols of data frames.
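The three shapes listed above can be mimicked with a hand-built data frame, which may help when writing code that consumes such results. This is only an illustration: the column names and values are invented, and a real download returns a tibble rather than a base data frame.

```r
# Hand-built stand-in for a downloaded result; all names and values are invented
df <- data.frame(id = 1:2)

# Repeated values (array) -> list-col of vectors
df$tags <- list(c("a", "b"), character(0))

# Record -> list-col of named lists
df$address <- list(
  list(city = "Oslo", zip = "0150"),
  list(city = "Lima", zip = "15001")
)

# Repeated record -> list-col of data frames
df$orders <- list(
  data.frame(sku = c("x1", "x2"), qty = c(1L, 3L)),
  data.frame(sku = character(0), qty = integer(0))
)

# Individual cells are reached with [[ ]]
df$tags[[1]]
df$address[[2]]$city
nrow(df$orders[[1]])

# Number of repeated values per row
lengths(df$tags)
```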

Larger datasets

In my timings, this code takes around 1 minute per 100 MB of data. If you need to download considerably more than this, I recommend:

  • Export a .csv file to Cloud Storage using bq_table_save()

  • Use the gsutil command line utility to download it

  • Read the csv file into R with readr::read_csv() or data.table::fread().

Unfortunately, you cannot export nested or repeated fields to CSV, and the formats BigQuery supports that do allow nested/repeated values (Avro and ndjson) are not well supported in R.
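The three steps above could be wrapped up roughly as follows. This is a sketch under several assumptions: download_large_table() is a hypothetical wrapper, the project, dataset, table, and bucket names in the usage comment are placeholders, gsutil must already be installed and authenticated, and destination_format = "CSV" is assumed to be passed through bq_table_save() to the underlying extract job.

```r
# Hypothetical wrapper (not part of bigrquery): export a table to Cloud
# Storage as CSV, copy the files locally with gsutil, and read them into R.
download_large_table <- function(tb, uri, local_dir) {
  # 1. Export the table to Cloud Storage as CSV
  bigrquery::bq_table_save(tb, destination_uris = uri, destination_format = "CSV")

  # 2. Copy the exported files to a local directory; the URI is quoted so
  #    that the shell leaves any wildcard for gsutil to expand
  dir.create(local_dir, showWarnings = FALSE)
  status <- system2("gsutil", c("cp", shQuote(uri), shQuote(local_dir)))
  if (status != 0) stop("gsutil cp failed with status ", status)

  # 3. Read every exported file and stack the results
  files <- list.files(local_dir, pattern = "\\.csv$", full.names = TRUE)
  do.call(rbind, lapply(files, readr::read_csv))
}

# Usage (placeholder names; requires BigQuery and Cloud Storage access):
# tb <- bigrquery::bq_table("my-project", "my_dataset", "my_table")
# df <- download_large_table(tb, "gs://my-bucket/my_table-*.csv", "my_table_export")
```

The wildcard in the destination URI lets BigQuery split a large export across several files, which is why the last step reads a directory rather than a single file.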

Examples

if (bq_testable()) {
  df <- bq_table_download("publicdata.samples.natality", max_results = 35000)
}

bigrquery

An Interface to Google's 'BigQuery' 'API'

v1.3.2
GPL-3
Authors
Hadley Wickham [aut, cre] (<https://orcid.org/0000-0003-4757-117X>), Jennifer Bryan [aut] (<https://orcid.org/0000-0002-6983-2759>), Kungliga Tekniska Högskolan [ctb] (strptime implementation), The NetBSD Foundation, Inc. [ctb] (gmtime implementation), RStudio [cph, fnd]
Initial release