Upload data to BigQuery
bqr_upload_data(
  projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset(),
  tableId,
  upload_data,
  create = c("CREATE_IF_NEEDED", "CREATE_NEVER"),
  writeDisposition = c("WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"),
  schema = NULL,
  sourceFormat = c("CSV", "DATASTORE_BACKUP", "NEWLINE_DELIMITED_JSON", "AVRO"),
  wait = TRUE,
  autodetect = FALSE,
  nullMarker = NULL,
  maxBadRecords = NULL,
  allowJaggedRows = FALSE,
  allowQuotedNewlines = FALSE,
  fieldDelimiter = NULL
)
projectId: The BigQuery project ID.
datasetId: A datasetId within projectId.
tableId: The ID of the table the data will end up in.
upload_data: The data to upload, either a data.frame object or a Google Cloud Storage URI.
create: Whether to create a new table if necessary, or error if it already exists.
writeDisposition: How to add the data to a table: WRITE_TRUNCATE, WRITE_APPEND or WRITE_EMPTY.
schema: If upload_data is a Google Cloud Storage URI, the data schema to supply. For CSV a schema can be generated from a small data sample with schema_fields().
sourceFormat: If upload_data is a Google Cloud Storage URI, the format of the data: CSV, DATASTORE_BACKUP, NEWLINE_DELIMITED_JSON or AVRO.
wait: If uploading a data.frame, whether to wait for the upload to finish before returning.
autodetect: Experimental feature that auto-detects the schema for CSV and JSON files.
nullMarker: Specifies a string that represents a null value in a CSV file. For example, if you specify "\N", BigQuery interprets "\N" as a null value when loading a CSV file.
maxBadRecords: The maximum number of bad records that BigQuery can ignore when running the job.
allowJaggedRows: Whether to allow rows with variable-length columns.
allowQuotedNewlines: Whether to allow datasets with quoted newlines.
fieldDelimiter: The separator for fields in a CSV file. The default is a comma (","). The CSV-specific options are illustrated in the sketch after this list.
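The CSV-specific options above can be combined in a single call. The following is a hedged sketch, not taken from the package documentation: the bucket, dataset, table and file names are placeholders, and the schema is derived from a small local sample via schema_fields().

library(bigQueryR)

## small local sample of the same data, used only to derive a schema
sample_df <- read.csv("local_sample.csv")
user_schema <- schema_fields(sample_df)

bqr_upload_data(projectId = "your-project",
                datasetId = "test",
                tableId = "csv_options_demo",
                upload_data = "gs://your-project/big_file.csv",
                schema = user_schema,
                sourceFormat = "CSV",
                fieldDelimiter = ";",        # semicolon-separated file
                nullMarker = "\\N",          # treat \N as NULL
                maxBadRecords = 10,          # tolerate up to 10 bad rows
                allowQuotedNewlines = TRUE)  # quoted fields may contain newlines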
A temporary CSV file is created when uploading from a local data.frame; a minimal sketch of this path follows below.
For larger files, up to 5TB, upload to Google Cloud Storage first via gcs_upload(), then supply the object URI of the form gs://project-name/object-name to the upload_data argument. You also need to supply a data schema, and remember that the file should not have a header row.
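A minimal sketch of the local data.frame path, assuming interactive authentication via bqr_auth() and the global-default setters bqr_global_project()/bqr_global_dataset() from the same package; the project, dataset and table IDs are placeholders.

library(bigQueryR)

## authenticate interactively and set placeholder defaults
bqr_auth()
bqr_global_project("your-project")
bqr_global_dataset("test")

## upload a small local data.frame; a temporary CSV file is written behind the scenes
bqr_upload_data(tableId = "mtcars_local",
                upload_data = mtcars,
                writeDisposition = "WRITE_TRUNCATE")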
Returns TRUE if successful, FALSE if not.
https://cloud.google.com/bigquery/loading-data-post-request
## Not run:
library(googleCloudStorageR)
library(bigQueryR)

gcs_global_bucket("your-project")

## custom upload function to ignore quotes and column headers
f <- function(input, output) {
  write.table(input, sep = ",", col.names = FALSE, row.names = FALSE,
              quote = FALSE, file = output, qmethod = "double")
}

## upload files to Google Cloud Storage
gcs_upload(mtcars, name = "mtcars_test1.csv", object_function = f)
gcs_upload(mtcars, name = "mtcars_test2.csv", object_function = f)

## create the schema of the files you just uploaded
user_schema <- schema_fields(mtcars)

## load files from Google Cloud Storage into BigQuery
bqr_upload_data(projectId = "your-project",
                datasetId = "test",
                tableId = "from_gcs_mtcars",
                upload_data = c("gs://your-project/mtcars_test1.csv",
                                "gs://your-project/mtcars_test2.csv"),
                schema = user_schema)

## for big files, it's helpful to create your schema on a small sample
## a quick way to do this on the command line is:
# "head bigfile.csv > head_bigfile.csv"

## upload nested lists as JSON
the_list <- list(
  list(col1 = "yes", col2 = "no",
       col3 = list(nest1 = 1, nest2 = 3), col4 = "oh"),
  list(col1 = "yes2", col2 = "n2o",
       col3 = list(nest1 = 5, nest2 = 7), col4 = "oh2"),
  list(col1 = "yes3", col2 = "no3",
       col3 = list(nest1 = 7, nest2 = 55), col4 = "oh3"))

bqr_upload_data(datasetId = "test",
                tableId = "nested_list_json",
                upload_data = the_list,
                autodetect = TRUE)

## End(Not run)
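As a follow-up check (not part of the original example), you could confirm the load with the package's listing and query helpers. A hedged sketch reusing the placeholder IDs above:

library(bigQueryR)

## list tables in the dataset to confirm the new tables exist
bqr_list_tables(projectId = "your-project", datasetId = "test")

## count the rows in the freshly loaded table
## (assumes the supplied datasetId resolves the unqualified table name)
bqr_query(projectId = "your-project",
          datasetId = "test",
          query = "SELECT COUNT(*) AS n FROM from_gcs_mtcars")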