# Google BigQuery


Google BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse with an in-memory BI Engine and machine learning built in. The Workato connector to Google BigQuery allows you to automate various actions on datasets in your BigQuery instance, such as inserting rows or running queries on existing datasets. You can also trigger recipes off new rows in datasets.

# API version

The Google BigQuery connector uses the Google BigQuery API v2.

# How to connect to BigQuery on Workato

The BigQuery connector supports two types of authentication: OAuth 2.0 and Service Account.

Configuring BigQuery connection

| Field | Description |
| --- | --- |
| Connection name | Give this BigQuery connection a unique name that identifies which BigQuery instance it is connected to. |
| Authentication Type | Choose OAuth 2.0 when using a user account and Service Account when using a service account to connect to BigQuery. |
| GCP Project service account email | (Applicable only to Service Account authentication) The email address of your service account. |
| Private key | (Applicable only to Service Account authentication) The private key of your service account. |
| Advanced settings | (Applicable only to Service Account authentication) Adjust the scopes of your connection. |

# Service Account

You can also authenticate to BigQuery using a Google Cloud service account. A service account is a special type of Google account associated with your Google Cloud project that can be used to make API requests on your behalf. Service accounts can be used in BigQuery to ensure that the solution continues running even if individual users' permissions change. Read more about service accounts here.

To create a service account, you need to log into your Google Cloud Platform (GCP) console. Follow the guide here to create a new service account in your GCP project. Follow this guide to add a new private key and download the key in JSON format. Note that after you download the key file, you cannot download it again.

Copy the private key, from -----BEGIN PRIVATE KEY----- to -----END PRIVATE KEY-----\n (both inclusive), and paste it into the Private key field on Workato.
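
If you prefer to pull these values out of the downloaded key file programmatically instead of opening it in an editor, the following is a minimal Python sketch. It assumes the key file is saved locally as service-account-key.json (a hypothetical filename); the standard service account key format contains client_email and private_key fields, which map to the connection fields described above.

```python
# A minimal sketch, assuming the downloaded key file is saved locally as
# service-account-key.json (a hypothetical filename).
import json

with open("service-account-key.json") as f:
    key_file = json.load(f)

# Value for the "GCP Project service account email" connection field.
print(key_file["client_email"])

# Value for the "Private key" connection field: the block from
# -----BEGIN PRIVATE KEY----- to -----END PRIVATE KEY-----\n, both inclusive.
print(key_file["private_key"])
```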

Getting GCP Project service account email

# Setting permissions

For the service account to use all actions and triggers in the BigQuery connector, it should be given the BigQuery Admin IAM role or a custom role with the following 11 permissions:

  • bigquery.datasets.get
  • bigquery.jobs.create
  • bigquery.jobs.get
  • bigquery.jobs.list
  • bigquery.jobs.listAll
  • bigquery.tables.create
  • bigquery.tables.get
  • bigquery.tables.getData
  • bigquery.tables.list
  • bigquery.tables.update
  • bigquery.tables.updateData

TIP

If you would like to only insert data with this connection on Workato, you may:

  • Use the BigQuery Data Editor role.
  • Add the bigquery.jobs.create permission.

If you would like to only select data with this connection on Workato, you may:

  • Use the BigQuery Data Viewer role.
  • Add the bigquery.jobs.create permission.

BigQuery custom role with 11 assigned permissions

Read this guide to create custom roles for the service account on Google Cloud Platform. More information on each of the required permissions above can be found in the list of BigQuery permissions.
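
If you prefer to script the custom role instead of creating it in the console, here is a minimal sketch using the google-api-python-client library and the IAM API's projects.roles.create method. The project ID, role ID, and title are placeholder values, and it assumes credentials that are allowed to manage IAM roles; the guide linked above remains the authoritative walkthrough.

```python
# A minimal sketch: create a custom role carrying the 11 permissions listed
# above. The project ID, role ID, and title are placeholders.
from googleapiclient import discovery

PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.jobs.create",
    "bigquery.jobs.get",
    "bigquery.jobs.list",
    "bigquery.jobs.listAll",
    "bigquery.tables.create",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.tables.update",
    "bigquery.tables.updateData",
]

# Uses application default credentials to call the IAM API.
iam = discovery.build("iam", "v1")

iam.projects().roles().create(
    parent="projects/your-project-id",
    body={
        "roleId": "workatoBigquery",
        "role": {
            "title": "Workato BigQuery connector",
            "includedPermissions": PERMISSIONS,
            "stage": "GA",
        },
    },
).execute()
```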

Note that when a service account with a custom role is used, the Select project dropdown in Setup will not load, so you must supply the Project ID on Workato manually.

# Rate limits on BigQuery

BigQuery's rate limits on tables indicate that bulk insert operations can only be performed on a table 1,000 times a day. To avoid these limits, use the Insert row and Insert rows actions, which are not affected by them.
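
For reference, BigQuery counts batch load jobs against per-table daily limits, while streaming inserts (the tabledata.insertAll API) are governed by separate throughput quotas, which is consistent with the behavior described above. The sketch below uses the google-cloud-bigquery Python client to issue a streaming insert; the table ID and row contents are placeholder values.

```python
# A minimal sketch of a streaming insert, assuming the google-cloud-bigquery
# library; the table ID and row contents are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

rows = [{"string_column": "USD", "integer_column": 1111}]

# insert_rows_json issues a tabledata.insertAll request, which is not counted
# against the per-table daily load-job limit.
errors = client.insert_rows_json("your-project.your_dataset.your_table", rows)
if errors:
    raise RuntimeError(f"Streaming insert failed: {errors}")
```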

# Working with the BigQuery connector

After establishing a connection with the BigQuery connector, most actions will require some additional parameter inputs.

| Field | Description |
| --- | --- |
| Project | The project available in the connection to be billed for the query. |
| Dataset | The dataset from which the action or trigger will pull the possible tables. |
| Table | The table inside the dataset. |
| Location | The geographical location where the job should be run. |

# Single row vs batch of rows

The BigQuery connector can read from or write to your database either as a single row or in batches. When using batch read actions/triggers, you have to provide the batch size you wish to work with. The batch size can be any number from 1 to 50,000, with 50,000 being the maximum.

Batch trigger inputs

Besides the difference in input fields, there is also a difference between the outputs of these 2 types of operations. A trigger that processes rows one at a time will have an output datatree that allows you to map data from that single row.

Single row output

However, a trigger that processes rows in batches will output them as an array of rows. The Rows datapill indicates that the output is a list containing data for each row in that batch.

Batch trigger output

As a result, the output of batch triggers/actions needs to be handled differently. When a downstream batch action accepts a Rows source list input field, you can map the Rows array pill to it to tell Workato that you want to map the entire array.

Using batch trigger output

# WHERE condition

This input field is used to filter and identify rows to perform an action on. It is used in multiple triggers and actions in the following ways:

  • filter rows to be picked up in triggers
  • filter rows in Select rows action
  • filter rows to be deleted in Delete rows action

This clause will be used as the WHERE clause in each request and should follow basic SQL syntax. Refer to this BigQuery documentation for a full list of rules for writing SQL statements compatible with BigQuery.
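
To make this concrete, the sketch below runs the same kind of statement directly with the google-cloud-bigquery Python client, outside of Workato: the connector only takes the condition as input and applies it as the WHERE clause of the generated statement. The table name and condition are placeholder values.

```python
# A minimal sketch, assuming the google-cloud-bigquery library; the table
# and condition are placeholder values.
from google.cloud import bigquery

client = bigquery.Client()

# This is what you would type into the WHERE condition input field.
where_condition = "string_column = 'USD' and integer_column = 1111"

# The connector applies the condition as the WHERE clause of the statement.
query = (
    "SELECT * FROM `your-project.your_dataset.your_table` "
    f"WHERE {where_condition}"
)

for row in client.query(query).result():
    print(dict(row))
```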

# Simple statements

Knowing the data types of the columns in BigQuery is important for building working queries. When comparing string values, the value must be enclosed in single quotes ('') and the columns used must exist in the table. When comparing integer values, the supplied value should not be enclosed in single quotes.

A simple WHERE condition to filter rows based on column values looks like this.

string_column = 'USD' and integer_column = 1111

If used in a Select rows action, this WHERE condition will return all rows that have the value 'USD' in the string_column column and 1111 in the integer_column column. Just remember to wrap datapills with single quotes in your inputs when they map to string values.

Using datapills in WHERE condition

# Complex statements

Your WHERE condition can also contain subqueries. The following condition can be used on the users table.

id in (select distinct(user_id) from zendesk.tickets where priority = 2)

When used in a Delete rows action, this will delete all rows in the users table where at least one associated row in the zendesk.tickets table has a value of 2 in the priority column.

Using datapills in WHERE condition with subquery

# Defining your output fields

Some actions and triggers in BigQuery allow you to define the expected output columns of a query. This input field shows up in the following triggers and actions:

  • Scheduled query trigger
  • Get query job output
  • Select rows using custom SQL

Define your output fields easily in the output schema designer by using our CSV uploader. In these cases, simply run a sample query in the BigQuery console and export the results as a CSV.

Export CSV from BigQuery

Upload this CSV into the schema wizard to automatically generate all the fields for you.

CSV schema wizard


Last updated: 5/17/2024, 2:35:03 AM