Extract data asynchronously
Use this instead of bqr_query for big datasets. Requires a Google Cloud Storage bucket, which you can create at https://console.cloud.google.com/storage/browser
bqr_extract_data(projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset(), tableId, cloudStorageBucket,
  filename = paste0("big-query-extract-", gsub(" |:|-", "", Sys.time()),
  "-*.csv"), compression = c("NONE", "GZIP"),
  destinationFormat = c("CSV", "NEWLINE_DELIMITED_JSON", "AVRO"),
  fieldDelimiter = ",", printHeader = TRUE)
projectId | The BigQuery project ID.
datasetId | A datasetId within projectId.
tableId | ID of the table you wish to extract.
cloudStorageBucket | URI of the bucket to extract into.
filename | Name of the extracted file. Include a wildcard (*) if the extract is expected to be larger than 1GB.
compression | Compression of the file.
destinationFormat | Format of the file.
fieldDelimiter | Field delimiter of the file.
printHeader | Whether to include a header row.
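To make the defaults in the usage signature concrete, here is a minimal sketch in base R of how they might evaluate. The helper names default_extract_filename and pick_compression are hypothetical and not part of bigQueryR. The timestamp is fixed and formatted explicitly so the result is reproducible, and the use of match.arg() for the choice vectors is an assumption based on the usual R idiom, not something this page confirms.

```r
# Hypothetical helper (not part of bigQueryR): rebuilds the default
# `filename` expression from the usage signature for a given time.
default_extract_filename <- function(time = Sys.time()) {
  # Format explicitly so the timestamp has no fractional seconds
  stamp <- format(time, "%Y-%m-%d %H:%M:%S")
  # Strip spaces, colons and hyphens, as the default expression does
  paste0("big-query-extract-", gsub(" |:|-", "", stamp), "-*.csv")
}

fixed_time <- as.POSIXct("2024-01-15 10:30:00", tz = "UTC")
example_name <- default_extract_filename(fixed_time)
print(example_name)
# [1] "big-query-extract-20240115103000-*.csv"

# Assumption: choice vectors such as `compression` are resolved with
# match.arg(), so the first element acts as the default.
pick_compression <- function(compression = c("NONE", "GZIP")) {
  match.arg(compression)
}

default_compression <- pick_compression()       # "NONE"
gzip_compression <- pick_compression("GZIP")    # "GZIP"
```

The trailing "-*" in the default name is the wildcard the filename argument asks for when the extract may exceed 1GB.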
Returns a Job object to be queried via bqr_get_job
Other BigQuery asynch query functions: bqr_download_extract,
bqr_get_job,
bqr_grant_extract_access,
bqr_query_asynch,
bqr_wait_for_job
## Not run:
library(bigQueryR)
## Auth with a project that has at least BigQuery and Google Cloud Storage scope
bqr_auth()
## make a big query
job <- bqr_query_asynch("your_project",
"your_dataset",
"SELECT * FROM blah LIMIT 9999999",
destinationTableId = "bigResultTable")
## poll the job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job)
## once done, the query results are in "bigResultTable"
## extract that table to GoogleCloudStorage:
# Create a bucket at Google Cloud Storage at
# https://console.cloud.google.com/storage/browser
job_extract <- bqr_extract_data("your_project",
"your_dataset",
"bigResultTable",
"your_cloud_storage_bucket_name")
## poll the extract job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job_extract$jobReference$jobId)
## You should also see the extract in the Google Cloud Storage bucket
googleCloudStorageR::gcs_list_objects("your_cloud_storage_bucket_name")
## to download via a URL without logging in via the Google Cloud Storage interface:
## Use an email that is Google account enabled
## Requires scopes:
## https://www.googleapis.com/auth/devstorage.full_control
## https://www.googleapis.com/auth/cloud-platform
download_url <- bqr_grant_extract_access(job_extract, "your@email.com")
## download_url may contain multiple URLs if the data is > 1GB
## End(Not run)
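The example polls each job by hand until its state is "DONE". A minimal sketch of that loop follows. poll_until_done and make_stub are hypothetical helpers, not part of bigQueryR, and the stub stands in for a real job so the code runs without credentials. The package also lists bqr_wait_for_job among its related functions, which may already cover this.

```r
# Hypothetical helper (not part of bigQueryR): calls `get_state` repeatedly
# until it returns "DONE", sleeping `interval` seconds between checks.
poll_until_done <- function(get_state, interval = 5, max_tries = 60) {
  for (i in seq_len(max_tries)) {
    state <- get_state()
    if (identical(state, "DONE")) {
      return(invisible(i))  # number of checks that were needed
    }
    Sys.sleep(interval)
  }
  stop("Job did not reach state DONE after ", max_tries, " checks")
}

# Stub that replays a fixed sequence of states instead of calling the API
make_stub <- function(states) {
  i <- 0
  function() {
    i <<- min(i + 1, length(states))
    states[[i]]
  }
}

checks <- poll_until_done(make_stub(c("PENDING", "RUNNING", "DONE")),
                          interval = 0)

# With a real job, `get_state` would wrap the call shown in the example
# (assumed form, untested here):
# poll_until_done(function() {
#   bqr_get_job("your_project", job_extract$jobReference$jobId)$status$state
# })
```

Capping the number of checks keeps a stalled job from blocking the session indefinitely.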