bigQueryR: bqr_extract_data – R documentation

bqr_extract_data

Extract data asynchronously

Description

Use this instead of bqr_query for big datasets. The extract is written to a Google Cloud Storage bucket, which you must create first at https://console.cloud.google.com/storage/browser

Usage

bqr_extract_data(projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset(), tableId, cloudStorageBucket,
  filename = paste0("big-query-extract-", gsub(" |:|-", "", Sys.time()),
  "-*.csv"), compression = c("NONE", "GZIP"),
  destinationFormat = c("CSV", "NEWLINE_DELIMITED_JSON", "AVRO"),
  fieldDelimiter = ",", printHeader = TRUE)

Arguments

`projectId`	The BigQuery project ID.
`datasetId`	A datasetId within projectId.
`tableId`	ID of table you wish to extract.
`cloudStorageBucket`	URI of the bucket to extract into.
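`filename`	Name of the exported file. Include a wildcard (*) if the extract is expected to be > 1GB.
`compression`	Compression of the exported file: "NONE" or "GZIP".
`destinationFormat`	Format of the exported file: "CSV", "NEWLINE_DELIMITED_JSON" or "AVRO".
`fieldDelimiter`	Delimiter between fields in the exported file.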
`printHeader`	Whether to include header row.
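
The default `filename` in the Usage section can be reproduced in base R, which makes it easier to predict what will appear in the bucket. The sketch below is illustrative only: `expand_wildcard` is a hypothetical helper, not part of bigQueryR, and the zero-padded 12-digit shard numbering is an assumption taken from BigQuery's export documentation rather than from this page.

```r
# A fixed timestamp keeps the output reproducible; the real default uses Sys.time().
stamp_time <- as.POSIXct("2024-01-02 03:04:05", tz = "UTC")

# Same cleanup as the default: drop spaces, colons and hyphens.
stamp <- gsub(" |:|-", "", format(stamp_time, "%Y-%m-%d %H:%M:%S"))
default_filename <- paste0("big-query-extract-", stamp, "-*.csv")
default_filename

# Hypothetical helper: shows how the wildcard might expand into one name per
# exported shard (assumed numbering: 000000000000, 000000000001, ...).
expand_wildcard <- function(filename, n_shards) {
  shards <- sprintf("%012d", seq_len(n_shards) - 1L)
  vapply(shards, function(s) sub("*", s, filename, fixed = TRUE),
         character(1), USE.NAMES = FALSE)
}
expand_wildcard(default_filename, 2)
```

If you pass your own `filename` for a large table, keeping a `*` somewhere in the name preserves this behaviour.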

Value

A Job object to be queried via bqr_get_job.

Examples

## Not run: 
library(bigQueryR)

## Auth with a project that has at least BigQuery and Google Cloud Storage scope
bqr_auth()

## make a big query
job <- bqr_query_asynch("your_project", 
                        "your_dataset",
                        "SELECT * FROM blah LIMIT 9999999", 
                        destinationTableId = "bigResultTable")
                        
## poll the job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job)

## once done, the query results are in "bigResultTable"
## extract that table to Google Cloud Storage:
## first create a bucket at Google Cloud Storage at
## https://console.cloud.google.com/storage/browser

job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name")
                                
## poll the extract job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job_extract$jobReference$jobId)

## You should also see the extract in the Google Cloud Storage bucket
googleCloudStorageR::gcs_list_objects("your_cloud_storage_bucket_name")

## to download via a URL without logging in to the Google Cloud Storage interface:
## Use an email that is Google account enabled
## Requires scopes:
##  https://www.googleapis.com/auth/devstorage.full_control
##  https://www.googleapis.com/auth/cloud-platform

download_url <- bqr_grant_extract_access(job_extract, "your@email.com")

## download_url may contain multiple URLs if the data is > 1GB


## End(Not run)
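
The examples poll each job by hand until its state is "DONE". A minimal sketch of automating that wait is shown below. `wait_for_job` and `make_fake_fetch` are hypothetical helpers, not part of bigQueryR, and the check on `status$errorResult` assumes the job object mirrors the BigQuery REST job resource. A fake job stands in for a real one so the block runs without credentials.

```r
# Poll until a job reports state "DONE". `fetch_job` is any zero-argument
# function that returns a job-like list with a status$state element.
wait_for_job <- function(fetch_job, interval = 5, max_tries = 60) {
  for (i in seq_len(max_tries)) {
    job <- fetch_job()
    if (identical(job$status$state, "DONE")) {
      # Assumption: a failed job carries status$errorResult.
      if (!is.null(job$status$errorResult)) {
        stop("Job failed: ", job$status$errorResult$message)
      }
      return(job)
    }
    Sys.sleep(interval)
  }
  stop("Job not DONE after ", max_tries, " polls")
}

# Stand-in for a real job: walks through the given states, one per call,
# and keeps returning the last state afterwards.
make_fake_fetch <- function(states) {
  i <- 0
  function() {
    i <<- min(i + 1, length(states))
    list(jobReference = list(jobId = "fake_job_id"),
         status = list(state = states[i]))
  }
}

finished <- wait_for_job(make_fake_fetch(c("RUNNING", "RUNNING", "DONE")),
                         interval = 0)
finished$status$state
```

With a real extract, `fetch_job` could wrap the call from the examples, for instance `function() bqr_get_job("your_project", job_extract$jobReference$jobId)`.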

bigQueryR

Interface with Google BigQuery with Shiny Compatibility

v0.5.0

MIT + file LICENSE

Authors

Mark Edmondson [aut, cre], Hadley Wickham [ctb]