Google Cloud Platform Integration

Transcend supports a full GCP integration that scans GCP projects to identify data storage systems that may contain personal data. This guide provides an overview of how the integration works, along with detailed setup instructions.

Transcend’s GCP integration automates the identification of data stores across Google Cloud infrastructure, including BigQuery, Cloud SQL, Cloud Storage, and similar services.

The integration's data silo discovery plugin programmatically scans each project and surfaces the cloud services configured in it, using Google's services.list method.

Diagram depicting the integration discovering cloud services

For each service discovered, the integration recommends a data silo representing that service. When the same service is used in multiple projects, the integration recommends more than one data silo for it. For example, if BigQuery is used in two different projects, the integration recommends two BigQuery data silos. In this way, a silo is recommended for each distinct data store.
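The recommendation behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Transcend's implementation: the `DATA_STORE_SERVICES` mapping, the `recommend_silos` function, and the sample project IDs are all hypothetical, and the input dictionary stands in for the per-project list of services that a scan would return.

```python
# Illustrative sketch only; not Transcend's actual implementation.
# Hypothetical mapping from API service names to data store types.
DATA_STORE_SERVICES = {
    "bigquery.googleapis.com": "BigQuery",
    "sqladmin.googleapis.com": "Cloud SQL",
    "storage.googleapis.com": "Cloud Storage",
}


def recommend_silos(enabled_services_by_project):
    """Return one (project, data store) recommendation per data store service."""
    recommendations = []
    for project_id, services in sorted(enabled_services_by_project.items()):
        for service in sorted(services):
            store = DATA_STORE_SERVICES.get(service)
            if store is not None:  # skip services that are not data stores
                recommendations.append((project_id, store))
    return recommendations


# Hypothetical scan results: BigQuery is used in two different projects.
scan = {
    "analytics-prod": ["bigquery.googleapis.com", "compute.googleapis.com"],
    "analytics-dev": ["bigquery.googleapis.com", "storage.googleapis.com"],
}
recommendations = recommend_silos(scan)
for project_id, store in recommendations:
    print(f"{project_id}: {store}")
```

In this sample, BigQuery appears once per project, so two BigQuery recommendations are produced, matching the example in the text. The compute service is ignored because it is not in the assumed list of data stores.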

The integration authenticates with a service user account created in a dedicated GCP project. Connecting through a service user account is the more secure option, because sensitive permissions can be assigned to it without granting the same permissions to an individual user. Additionally, a service user doesn't count as a user seat in the Google organization. Continue to the next section for additional details about authentication and setting up the integration.

Transcend uses a client credentials method to connect to your organization's Google Cloud Platform projects. Generating credentials specific to your Google organization involves a few steps. Before starting, confirm the following:

  1. You have access to your organization's Google Cloud Console, with permissions to create a new project and provision a service account.
  2. You have access to the Google Admin Console, with permissions to modify Security Settings for your organization.

Provision a Project and Service Account

  1. Create a new project. Create a dedicated project for the integration in your organization's Google Cloud Console, and enable the following APIs:

    If a GCP project was previously created for another Transcend Google integration, there's no need to create another project. Feel free to use the existing project.

  2. Create a service user account. Transcend recommends creating a dedicated service user account to connect this integration, even if another service user has been configured for another Transcend integration. Creating a service user with limited scope for each integration reduces the risk of over-privileged accounts.

    Navigate to the "IAM & Admin" tab for the desired project, select "Service Accounts", and then select Create Service Account. Give the service account a name you'll remember, for example, "transcend-integration".

    • You don't need to grant this service account any specific IAM roles or permissions.
    • Make note of the email address associated with this service account - you'll need it when connecting the integration.
  3. Source the client ID. Once the service account is created, select "Enable G Suite Domain-wide Delegation", and make note of the unique Client ID, as you will need to refer to this later.

    Enable Domain-wide delegation setting image

  4. Generate a private key. The Transcend connection form requires a private key for this service account. To create the key:

    • Visit the "Key" tab in the service account's settings page and select Add Key. Make sure to select JSON as the key type.
    • A key file downloads to your computer. Keep this JSON key file for the connection phase of the integration; Transcend only supports key files generated in the JSON format.
  5. Give the newly created service user access to every GCP project you would like Transcend to scan.

    • For each project, navigate to the IAM section and select + Add to add a user for the project.
    • Enter the email address of the service user and select Owner permissions.
    • Save the permissions and repeat for each additional project desired.
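Before moving on, it can help to confirm that the downloaded key file belongs to the intended service account and to read off the values needed later. The sketch below assumes the standard field names found in Google service account JSON keys (`type`, `client_email`, `client_id`, `private_key`); the function name `summarize_key` and all sample values are hypothetical placeholders.

```python
import json

REQUIRED_FIELDS = ("type", "client_email", "client_id", "private_key")


def summarize_key(key_json):
    """Check a service account key and return the identifiers needed in setup."""
    key = json.loads(key_json)
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    # Return only non-secret identifiers; the private key is never echoed.
    return {"client_email": key["client_email"], "client_id": key["client_id"]}


# Hypothetical key contents with placeholder values (not a real credential).
sample_key = json.dumps({
    "type": "service_account",
    "client_email": "transcend-integration@example-project.iam.gserviceaccount.com",
    "client_id": "000000000000000000000",
    "private_key": "-----BEGIN PRIVATE KEY-----\nPLACEHOLDER\n-----END PRIVATE KEY-----\n",
})
summary = summarize_key(sample_key)
print(summary)
```

With a real key, the file contents would be read from disk and passed in as a string. The returned email address and client ID correspond to the values noted in steps 2 and 3.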

Allowlist the Service Account

Once a dedicated service account is provisioned, the next step is to give it access to call the appropriate APIs in the Google organization.

  1. Go to your organization's Google Admin Console.
  2. From the navigation menu, select Security > Access and data controls > API Controls.
  3. Select Manage Domain Wide Delegation.
  4. Add a new "API Client", and in the form enter the Client ID of the Service Account noted in Step 3 of the previous section.
  5. Add the following OAuth scope, and then click "Authorize": https://www.googleapis.com/auth/cloud-platform.read-only.

Complete Transcend's Connection Form

To complete authentication for the integration, navigate back to the Transcend dashboard and enter the following fields in the integration connection form:

  • Administrator Account Email Address
    • Email address of a user that can access Google Cloud resources and service usage. This is usually an admin or account owner.
  • Service Account Email Address
    • This is the email address of the service user that was created for the integration. It looks similar to gcp-project@gcp-project.iam.gserviceaccount.com.
  • Service Account Private Key
    • This comes from the JSON key downloaded in setup. The integration connection form does not take the entire JSON object in the file, only the value of the private key field. To obtain the private key:
      • Open the file in a text editor (TextEdit, VS Code, etc.).
      • Look for the private key field and copy everything between the quotes.
      • Format the key before pasting it into the connection form. In the file, the line breaks in the key are written as the literal characters \n. The easiest way to format the key correctly is to copy it into a new editor window and run a "find & replace", where the "find" value is \n and the "replace" value is a line break (Enter or Return).
      • Copy the formatted key value into the connection form.
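As an alternative to a manual find and replace, the key can be extracted programmatically. A minimal sketch, assuming the standard `private_key` field of a Google service account JSON key: parsing the file with a JSON parser already converts each `\n` escape into a real line break. The helper names and the placeholder key below are hypothetical.

```python
import json


def extract_private_key(key_json):
    """Parse the key file; json.loads turns each \\n escape into a line break."""
    return json.loads(key_json)["private_key"]


def unescape_copied_key(copied_text):
    """Fix a value copied by hand, where \\n is still two literal characters."""
    return copied_text.replace("\\n", "\n")


# Placeholder file contents (not a real credential). In these raw strings, \n is
# a literal backslash followed by "n", exactly as it appears in the key file.
raw_file = r'{"private_key": "-----BEGIN PRIVATE KEY-----\nPLACEHOLDER\n-----END PRIVATE KEY-----\n"}'
copied = r"-----BEGIN PRIVATE KEY-----\nPLACEHOLDER\n-----END PRIVATE KEY-----\n"

parsed = extract_private_key(raw_file)
unescaped = unescape_copied_key(copied)
print(parsed)
```

Both helpers produce the same multi-line value. The first works on the whole key file, while the second mirrors the manual find and replace for a value that was already copied out of the file.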

Connect the integration.

Once the integration is authenticated, navigate to the Configuration tab and enable the data silo discovery plugin to programmatically discover the GCP resources used across projects in your organization's account. The plugin is specifically looking for data storage systems like databases, data warehouses and object/file storage systems.

Once the scan is complete, select View Data Inventory to review and approve the discovered GCP resources.

Configure GCP Integration in the Transcend Connection form

The discovered resources are available for review by selecting X Resources Found. From there, review each service to decide if it should be approved as a data silo. Resources can be configured for content classification and privacy requests after they have been approved.

Resources discovered by the GCP plugin and recommended as data silos

Once a discovered data silo has been approved and added to Data Inventory, it can be configured to further scan the individual resources to identify and classify the information stored within. This is particularly valuable for databases and data storage systems, where Content Classification can programmatically identify datapoints, provide classification recommendations, and identify personal data. To enable content classification for a resource, navigate to the Configuration tab of the desired data silo and enable the Datapoint Discovery plugin.

Enable datapoint discovery for content classification