Upload data to BigQuery
bqr_upload_data(
  projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset(),
  tableId,
  upload_data,
  create = c("CREATE_IF_NEEDED", "CREATE_NEVER"),
  writeDisposition = c("WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"),
  schema = NULL,
  sourceFormat = c("CSV", "DATASTORE_BACKUP", "NEWLINE_DELIMITED_JSON", "AVRO"),
  wait = TRUE,
  autodetect = FALSE,
  nullMarker = NULL,
  maxBadRecords = NULL,
  allowJaggedRows = FALSE,
  allowQuotedNewlines = FALSE,
  fieldDelimiter = NULL
)
projectId: The BigQuery project ID.
datasetId: A datasetId within projectId.
tableId: The ID of the table the data will end up in.
upload_data: The data to upload, either a data.frame object or a Google Cloud Storage URI.
create: Whether to create a new table if necessary, or error if it already exists.
writeDisposition: How to add the data to a table: WRITE_TRUNCATE, WRITE_APPEND or WRITE_EMPTY.
schema: If upload_data is a Google Cloud Storage URI, the data schema to supply. For CSV a schema can be generated from a small data sample with schema_fields().
sourceFormat: If upload_data is a Google Cloud Storage URI, the format of the data: CSV, DATASTORE_BACKUP, NEWLINE_DELIMITED_JSON or AVRO.
wait: If uploading a data.frame, whether to wait for the upload to finish before returning.
autodetect: Experimental feature that auto-detects the schema for CSV and JSON files.
nullMarker: Specifies a string that represents a null value in a CSV file. For example, if you specify "\N", BigQuery interprets "\N" as a null value when loading a CSV file.
maxBadRecords: The maximum number of bad records that BigQuery can ignore when running the job.
allowJaggedRows: Whether to allow rows with variable-length columns.
allowQuotedNewlines: Whether to allow datasets with quoted newlines.
fieldDelimiter: The separator for fields in a CSV file. The default is a comma (","). The CSV-specific options are illustrated in the sketch after this list.
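The CSV-specific options above can be combined in a single call. The following is a hedged sketch, not taken from the package documentation: the bucket, dataset, table and file names are placeholders, and the schema is derived from a small local sample via schema_fields().

library(bigQueryR)

## small local sample of the same data, used only to derive a schema
sample_df <- read.csv("local_sample.csv")
user_schema <- schema_fields(sample_df)

bqr_upload_data(projectId = "your-project",
                datasetId = "test",
                tableId = "csv_options_demo",
                upload_data = "gs://your-project/big_file.csv",
                schema = user_schema,
                sourceFormat = "CSV",
                fieldDelimiter = ";",        # semicolon-separated file
                nullMarker = "\\N",          # treat \N as NULL
                maxBadRecords = 10,          # tolerate up to 10 bad rows
                allowQuotedNewlines = TRUE)  # quoted fields may contain newlines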
A temporary CSV file is created when uploading from a local data.frame; a minimal sketch of this path follows below.
For larger files, up to 5TB, upload to Google Cloud Storage first via gcs_upload(), then supply the object URI of the form gs://project-name/object-name to the upload_data argument. You also need to supply a data schema, and remember that the file should not have a header row.
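A minimal sketch of the local data.frame path, assuming interactive authentication via bqr_auth() and the global-default setters bqr_global_project()/bqr_global_dataset() from the same package; the project, dataset and table IDs are placeholders.

library(bigQueryR)

## authenticate interactively and set placeholder defaults
bqr_auth()
bqr_global_project("your-project")
bqr_global_dataset("test")

## upload a small local data.frame; a temporary CSV file is written behind the scenes
bqr_upload_data(tableId = "mtcars_local",
                upload_data = mtcars,
                writeDisposition = "WRITE_TRUNCATE")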
Returns TRUE if successful, FALSE if not.
https://cloud.google.com/bigquery/loading-data-post-request
## Not run:
library(googleCloudStorageR)
library(bigQueryR)

gcs_global_bucket("your-project")

## custom upload function to ignore quotes and column headers
f <- function(input, output) {
  write.table(input, sep = ",", col.names = FALSE, row.names = FALSE,
              quote = FALSE, file = output, qmethod = "double")
}

## upload files to Google Cloud Storage
gcs_upload(mtcars, name = "mtcars_test1.csv", object_function = f)
gcs_upload(mtcars, name = "mtcars_test2.csv", object_function = f)

## create the schema of the files you just uploaded
user_schema <- schema_fields(mtcars)

## load files from Google Cloud Storage into BigQuery
bqr_upload_data(projectId = "your-project",
                datasetId = "test",
                tableId = "from_gcs_mtcars",
                upload_data = c("gs://your-project/mtcars_test1.csv",
                                "gs://your-project/mtcars_test2.csv"),
                schema = user_schema)

## for big files, it's helpful to create your schema on a small sample
## a quick way to do this on the command line is:
# "head bigfile.csv > head_bigfile.csv"

## upload nested lists as JSON
the_list <- list(
  list(col1 = "yes", col2 = "no",
       col3 = list(nest1 = 1, nest2 = 3), col4 = "oh"),
  list(col1 = "yes2", col2 = "n2o",
       col3 = list(nest1 = 5, nest2 = 7), col4 = "oh2"),
  list(col1 = "yes3", col2 = "no3",
       col3 = list(nest1 = 7, nest2 = 55), col4 = "oh3"))

bqr_upload_data(datasetId = "test",
                tableId = "nested_list_json",
                upload_data = the_list,
                autodetect = TRUE)

## End(Not run)
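As a follow-up check (not part of the original example), you could confirm the load with the package's listing and query helpers. A hedged sketch reusing the placeholder IDs above:

library(bigQueryR)

## list tables in the dataset to confirm the new tables exist
bqr_list_tables(projectId = "your-project", datasetId = "test")

## count the rows in the freshly loaded table
## (assumes the supplied datasetId resolves the unqualified table name)
bqr_query(projectId = "your-project",
          datasetId = "test",
          query = "SELECT COUNT(*) AS n FROM from_gcs_mtcars")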