Create and set up a Cloud resource connection
As a BigQuery administrator, you can create a Cloud resource connection that enables data analysts to perform the following tasks:
- Query structured Cloud Storage data using BigLake tables. BigLake tables enable you to query external data with access delegation.
- Query unstructured data in Cloud Storage using object tables.
- Implement remote functions with any supported languages in Cloud Run functions or Cloud Run.
For more information about connections, see Introduction to connections.
Before you begin
-
Enable the BigQuery Connection API.
-
To get the permissions that you need to create a Cloud Resource connection, ask your administrator to grant you the following IAM roles:
-
BigQuery Connection Admin (
roles/bigquery.connectionAdmin
) on the project -
Storage Object Viewer (
roles/storage.objectViewer
) on the bucket
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
If you want to query structured data using BigLake tables based on Cloud Storage or unstructured data using object tables, then the service account associated with the connection must also have the Storage Viewer (roles/storage.viewer
) role on the bucket that contains the external data.
-
BigQuery Connection Admin (
- Ensure that your version of the Google Cloud SDK is 366.0.0 or later:
gcloud version
If needed, update the Google Cloud SDK.
- Optional: For Terraform, use Terraform GCP version 4.25.0 or later. You can download the latest version from HashiCorp Terraform downloads.
Location consideration
When you use Cloud Storage to store data files, we recommend that you use Cloud Storage single-region or dual-region buckets for optimal performance, not multi-region buckets.
Create Cloud resource connections
BigLake uses a connection to access Cloud Storage. You can use this connection with a single table or a group of tables.
Select one of the following options:
Console
Go to the BigQuery page.
To create a connection, click
Add, and then click Connections to external data sources.In the Connection type list, select Vertex AI remote models, remote functions and BigLake (Cloud Resource).
In the Connection ID field, enter a name for your connection.
Click Create connection.
Click Go to connection.
In the Connection info pane, copy the service account ID for use in a later step.
bq
In a command-line environment, create a connection:
bq mk --connection --location=REGION --project_id=PROJECT_ID \ --connection_type=CLOUD_RESOURCE CONNECTION_ID
The
--project_id
parameter overrides the default project.Replace the following:
REGION
: your connection regionPROJECT_ID
: your Google Cloud project IDCONNECTION_ID
: an ID for your connection
When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.
Troubleshooting: If you get the following connection error, update the Google Cloud SDK:
Flags parsing error: flag --connection_type=CLOUD_RESOURCE: value should be one of...
Retrieve and copy the service account ID for use in a later step:
bq show --connection PROJECT_ID.REGION.CONNECTION_ID
The output is similar to the following:
name properties 1234.REGION.CONNECTION_ID {"serviceAccountId": "connection-1234-9u56h9@gcp-sa-bigquery-condel.iam.gserviceaccount.com"}
Terraform
Use the
google_bigquery_connection
resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
The following example creates a Cloud resource connection named
my_cloud_resource_connection
in the US
region:
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud Shell
- Launch Cloud Shell.
-
Set the default Google Cloud project where you want to apply your Terraform configurations.
You only need to run this command once per project, and you can run it in any directory.
export GOOGLE_CLOUD_PROJECT=PROJECT_ID
Environment variables are overridden if you set explicit values in the Terraform configuration file.
Prepare the directory
Each Terraform configuration file must have its own directory (also called a root module).
-
In Cloud Shell, create a directory and a new
file within that directory. The filename must have the
.tf
extension—for examplemain.tf
. In this tutorial, the file is referred to asmain.tf
.mkdir DIRECTORY && cd DIRECTORY && touch main.tf
-
If you are following a tutorial, you can copy the sample code in each section or step.
Copy the sample code into the newly created
main.tf
.Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
- Review and modify the sample parameters to apply to your environment.
- Save your changes.
-
Initialize Terraform. You only need to do this once per directory.
terraform init
Optionally, to use the latest Google provider version, include the
-upgrade
option:terraform init -upgrade
Apply the changes
-
Review the configuration and verify that the resources that Terraform is going to create or
update match your expectations:
terraform plan
Make corrections to the configuration as necessary.
-
Apply the Terraform configuration by running the following command and entering
yes
at the prompt:terraform apply
Wait until Terraform displays the "Apply complete!" message.
- Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
Grant access to the service account
To create remote functions, you must grant required roles to Cloud Run functions or Cloud Run.
To connect to Cloud Storage, you must give the new connection read-only access to Cloud Storage so that BigQuery can access files on behalf of users.
Select one of the following options:
Console
We recommend that you grant the connection resource service account the
Storage Object Viewer IAM role
(roles/storage.objectViewer
), which lets the service account access
Cloud Storage buckets.
Go to the IAM & Admin page.
Click
Add.The Add principals dialog opens.
In the New principals field, enter the service account ID that you copied earlier.
In the Select a role field, select Cloud Storage, and then select Storage Object Viewer.
Click Save.
gcloud
Use the gcloud storage buckets add-iam-policy-binding
command:
gcloud storage buckets add-iam-policy-binding gs://BUCKET \ --member=serviceAccount:MEMBER \ --role=roles/storage.objectViewer
Replace the following:
BUCKET
: the name of your storage bucket.MEMBER
: the service account ID that you copied earlier.
For more information, see Add a principal to a bucket-level policy.
Terraform
Use the
google_bigquery_connection
resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
The following example grants IAM role access to the service account of the Cloud resource connection:
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud Shell
- Launch Cloud Shell.
-
Set the default Google Cloud project where you want to apply your Terraform configurations.
You only need to run this command once per project, and you can run it in any directory.
export GOOGLE_CLOUD_PROJECT=PROJECT_ID
Environment variables are overridden if you set explicit values in the Terraform configuration file.
Prepare the directory
Each Terraform configuration file must have its own directory (also called a root module).
-
In Cloud Shell, create a directory and a new
file within that directory. The filename must have the
.tf
extension—for examplemain.tf
. In this tutorial, the file is referred to asmain.tf
.mkdir DIRECTORY && cd DIRECTORY && touch main.tf
-
If you are following a tutorial, you can copy the sample code in each section or step.
Copy the sample code into the newly created
main.tf
.Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
- Review and modify the sample parameters to apply to your environment.
- Save your changes.
-
Initialize Terraform. You only need to do this once per directory.
terraform init
Optionally, to use the latest Google provider version, include the
-upgrade
option:terraform init -upgrade
Apply the changes
-
Review the configuration and verify that the resources that Terraform is going to create or
update match your expectations:
terraform plan
Make corrections to the configuration as necessary.
-
Apply the Terraform configuration by running the following command and entering
yes
at the prompt:terraform apply
Wait until Terraform displays the "Apply complete!" message.
- Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
Share connections with users
You can grant the following roles to let users query data and manage connections:
roles/bigquery.connectionUser
: enables users to use connections to connect with external data sources and run queries on them.roles/bigquery.connectionAdmin
: enables users to manage connections.
For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Select one of the following options:
Console
Go to the BigQuery page.
Connections are listed in your project, in a group called External connections.
In the Explorer pane, click your project name > External connections > connection.
In the Details pane, click Share to share a connection. Then do the following:
In the Connection permissions dialog, share the connection with other principals by adding or editing principals.
Click Save.
bq
You cannot share a connection with the bq command-line tool. To share a connection, use the Google Cloud console or the BigQuery Connections API method to share a connection.
API
Use the
projects.locations.connections.setIAM
method
in the BigQuery Connections REST API reference section, and
supply an instance of the policy
resource.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
What's next
- Learn about different connection types.
- Learn about managing connections.
- Learn about BigLake tables.
- Learn how to create BigLake tables.
- Learn how to upgrade external tables to BigLake tables.
- Learn about object tables and how to create them.
- Learn how to implement remote functions.