Google Bigtable Set Up for DSR Automation

Just like any other database integration, with Transcend's Google Bigtable integration, you can automate GoogleSQL query execution during preflight enrichment using the Database Enricher. You can also fulfill Data Subject Requests on a Google Bigtable instance by running Google Bigtable mutations with our custom JSON payload for the desired data actions on each datapoint.

The first step to setting up DSRs against a Bigtable Instance is creating the datapoints in the data system that should be queried. We typically recommend creating a datapoint for each table in the instance that stores personal data (or any tables you want to execute DSRs against). For example, let's say there is a table called Chat History that contains all the messages sent back and forth from a customer. You could create a datapoint for Chat History in the data system and enable the specific data actions needed. If you're using Structured Discovery, you can enable the Datapoint Schema Discovery plugin to create the datapoints for you automatically. The subdatapoints discovered will be in the format columnFamily::column, or columnFamily::column.subfield if we discover JSON fields.

Create a datapoint in the Google Bigtable Integration

For each data action enabled for a datapoint in the Google Bigtable data system, you can define a JSON payload that will execute a database mutation. Using the previous Chat History example, let's say you want to enable the Chat History datapoint to support access/right to know requests. With the “access” data action enabled, you can define a specific JSON payload that executes the request to find the Chat History for a user against the database. The next sections outline how to construct the types of queries that are supported, how to construct the custom payload, and outlines sample queries.

Define a custom query for a data action on a datapoint

Transcend’s Google Bigtable integration supports the mutation types that can be used to create different types of DSRs a data subject can make (access/right to know, erasure, opt-out of communication, etc.). For example, an Access Data Request could be actioned with a read mutation type, and an Erasure Data Request could be actioned with a delete mutation type. The full list of supported mutation types is listed below:

  • read
  • insert
  • delete

As mentioned above, the Google Bigtable integration uses a custom query payload to execute operations on the database. The parameters to be included in the payload are described below. Please note that all parameters, except for mutations, are required and must be included in the payload.

  • table: the name of the table to perform the mutations on
  • columnFamily: the column family that contains the field to filter on
  • column: the column that contains the data to filter on
  • value: the value to compare, often the identifier, using a case-insensitive comparison
  • mutations: a list of mutations to execute on the table. If none are provided, it will perform a read operation. Each mutation will contain these parameters:
    • method: the mutation type to perform
    • data: fields to perform the mutation
      • Note that the data format will be different depending on the operation type. Please refer to the following examples.

​Below is an example of a custom payload for each query type

read

Sample Payload

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob"
}

insert

Sample Payload

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob",
  "mutations": [
    {
      "method": "insert",
      "data": {
        "user": { "status": "active" }
      }
    }
  ]
}

insert multiple values

Sample Payload

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob",
  "mutations": [
    {
      "method": "insert",
      "data": {
        "user": { "status": "active" },
        "user": { "verified": "true" }
      }
    }
  ]
}

Pro tip: insert will add a new version to the cell. To rewrite a cell, consider also using a delete

delete and insert

Sample Payload ​

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob",
  "mutations":
  [
    {
      "method": "delete",
      "data": ["user:status"]
    }
    {
      "method": "insert",
      "data": { "user": { "status": "active" } }
    }
  ]
}

delete multiple columns

Sample Payload ​

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob",
  "mutations": [
    {
      "method": "delete",
      "data": ["user:status", "user:verified", "logs"]
    }
  ]
}

delete entire rows

Sample Payload ​

{
  "table": "chat-history",
  "columnFamily": "user",
  "column": "username",
  "value": "bob",
  "mutations": [
    {
      "method": "delete"
    }
  ]
}

We also support a few template variables that can be used inside your mutation.

  • ? or {{identifier}} — This replaces the template with the identifier
  • {{requestId}} — This replaces the template variable with the ID of the request

To query for a user with a name identifier, you can replace the "value": "bob" in the JSON body with "value": ?, ommiting the surrounding quotation marks.

We also support GoogleSQL, a query language used by multiple Google Cloud services. More information can be found on the official documention on Google SQL for Bigtable

Preflight enrichment for Google Bigtable using the Database Enricher requires GoogleSQL, and it can also be used to execute SELECT statements for DSR's.

ex. to query for a user's email with a name identifier, use ? or {{identifier}} in the WHERE clause, and alias the column as email. The GoogleSQL string might look like:

SELECT contact['email'] AS email
FROM users
WHERE contact['name'] = 'bob'