GCP, BigQuery, Spanner Integration Connection Guide

Transcend's Google Cloud Platform (GCP) integration scans GCP projects to identify data systems that may contain personal data. Transcend also supports a DSR integration for Google's BigQuery database, alongside scanning your database to discover and classify data. The Google Spanner integration also supports database schema scanning to discover and classify data. This guide provides an overview on how these integrations work as well as detailed setup instructions.

Transcend's GCP integration automates the process of identifying data systems across Google cloud infrastructure. It includes things like BigQuery, CloudSQL, Cloud Storage, etc.

The integration's System Discovery plugin works by programmatically scanning each project to surface the cloud services configured for each using Google's list:services method.

Diagram depicting the integration discovering cloud services

For each service discovered, the integration will recommend a data system representing the service. It's probable that more than one data system will be recommended for the same service if it's used in multiple projects. For example, if two BigQuery instances are used in two different projects, the integration will recommend two BigQuery data systems. In this way a system is recommended for each distinct data system.

The integration is authenticated with a Service User created for a dedicated GCP Project. Using a service account to connect the integration is a more secure option for this integration, as it allows for sensitive permissions to be assigned without giving a person user the same permissions. Additionally, it doesn't count as a user seat in the Google Organization. Continue to the next section for additional details about authentication and setting up the integration.

Transcend uses a client credentials method to connect to your organization's Google Cloud Platform projects. There are a few steps involved to generate credentials specific to your Google organization.

Please ensure you have the following requirements in place prior to starting the connection:

Confirm access to your organization's Google Cloud Console
Confirm you have permissions to create a new project and provision a service account.

Create a dedicated project for the integration in your organization's Google Cloud Console, and enable the following APIs depending on which integration you are trying to connect:

GCP:
- Cloud Resource Manager API
- Service Usage API
BigQuery:
- BigQuery API
Spanner:
- Spanner API

If a project was previously created for another Transcend Google integration, there's no need to create another project. Feel free to use the existing project.

Transcend recommends creating a dedicated service account to connect this integration, even if another service user has been configured for another Transcend integration. Creating a service user with limited scope for each integration reduces the risk of superpowered accounts.

Navigate to the "IAM & Admin" tab for the desired project and select "Service Accounts" > select Create Service Account. Give the service account a name you'll remember, for example, "transcend-integration".

Make note of the email address associated with this service account — you'll need it to grant access to the GCP projects

A set of public-private key pairs for this account is needed to be used in the Transcend Connection form. You can create the key by:

Visiting the "Key" tab in the service account's settings page and selecting Add Key. Make sure to select JSON as the key type.
This will download a key file to your computer. You will need the JSON key file during the connection phase for the integration - Transcend only supports key files generated in the JSON format.

Give the newly created service user access to GCP projects you want scanned or, if using Transcend's BigQuery integration, the BigQuery project you want scanned and classified.

For GCP Roles:
- Create a custom role with resourcemanager.projects.get, servicemanagement.services.bind, and serviceusage.services.list permissions.
For BigQuery Roles:
- Find role information in the next section
For Cloud Spanner Roles:
- Find role information in the Cloud Spanner Permissions section
For each project, navigate to the IAM section and select + GRANT ACCESS to add a user for the project.
Enter the email address of the the service account and assign it the appropriate roles for GCP or BigQuery
Save the permissions and repeat for each additional project desired.

Regarding BigQuery Roles

BigQuery has many predefined roles that can be used in order to fit your many needs, such as:

BigQuery Admin — Grants full admin access
BigQuery Data Viewer — Grants read-only access
BigQuery Data Editor — Grants read-only + write access
BigQuery Data Owner — Grants full access
BigQuery Job User - Grants access to query jobs

These are just a sample of the predefined roles that Google has provided. There is also the option of creating your own custom role:

Navigate to the Roles section, which can be found under IAM & Admin
Click Create Role
Edit all the necessary information (title, description, etc.)
Add the permissions that you want the account to have. You can find the common permissions here

More information regarding BigQuery Roles can be found in their Documentation. At the minimum the service account requires at least:

BigQuery Job User — In order to create queries with the BigQuery API
BigQuery Data Viewer — To have access to read BigQuery datasets and tables, to enable schema discovery and classification, and access-based DSRs

At minimum, these are the permissions required:

bigquery.connections.get
bigquery.config.get
bigquery.dataPolicies.maskedGet
bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.jobs.create
bigquery.models.export
bigquery.models.getData
bigquery.models.getMetadata
bigquery.models.list
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.routines.get
bigquery.routines.list
bigquery.tables.createSnapshot
bigquery.tables.export
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.getIamPolicy
bigquery.tables.list
resourcemanager.projects.get
resourcemanager.projects.getIamPolicy

Note: DSRs that require modifying data, will require the BigQuery Data Editor role instead of the BigQuery Data Viewer role, to allow both read and write access

Regarding Google Spanner Roles and Permissions

Google Spanner has some predefined roles that can be used in order to fit your many needs, such as:

Cloud Spanner Admin — Grants full admin access
Cloud Spanner Database Admin - Grants full database access

There is also the option of creating your own custom role:

Navigate to the Roles section, which can be found under IAM & Admin
Click Create Role
Edit all the necessary information (title, description, etc.)
Add the permissions that you want the account to have. You can find the common permissions here

More information regarding Spanner Roles can be found in their Documentation.

At minimum, these are the permissions required:

spanner.instances.get
spanner.instances.list
spanner.databases.list
spanner.databases.get
spanner.databases.getDdl
spanner.databases.read
spanner.databases.select
spanner.databases.write
spanner.sessions.create
spanner.sessions.delete

To complete authentication for the integration, navigate back to the Transcend dashboard and enter the following fields in the integration connection form:

Service Account's JSON Key File

Google Cloud Project ID
- Enter the project ID that contains your BigQuery Database
Service Account's JSON Key File

Google Cloud Project ID
- Enter the project ID that contains your Cloud Spanner Instance
Cloud Spanner Instance ID
- Enter your Google Cloud Instance ID, note that this is different from the display name of the instance.
Service Account's JSON Key File

In addition to the initial connection, the Google Cloud / BigQuery integration also provides functionality for data mapping. Review our Plugin Configuration for Google Cloud and BigQuery guide for more details.

GCP, BigQuery, Spanner Integration Connection Guide

Overview

How the GCP Integration Works

Prerequisites

Configuration

Provision a Project or Service Account

1. Create a New Project

2. Create a Service Account

3. Generate a Private Key

4. Grant Permissions

Regarding BigQuery Roles

Regarding Google Spanner Roles and Permissions

Complete Transcend's Connection Form

GCP

BigQuery

Cloud Spanner

Additional Configuration Guides