MongoDB Integration

Transcend maintains an integration for MongoDB databases that supports Structured Discovery and DSR Automation functionality, allowing you to:

  • Scan your database to identify datapoints that contain personal information
  • Programmatically classify the data category and storage purpose of datapoints
  • Define and execute DSRs directly against your database

The first step to connecting a MongoDB database to Transcend is to add a MongoDB integration.

The database is connected with a URI connection string in the standard connection string format. If you are using MongoDB Atlas, you can use the instructions here to get a connection string. Alternatively, you can manually construct a connection string using this format.
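
If you construct the connection string manually, a small helper can reduce escaping mistakes. The following is a minimal sketch, assuming the non-SRV `mongodb://` form of the standard connection string format; the function name `buildMongoUri`, the hosts, and the credentials are placeholders invented for illustration.

```javascript
// Sketch: assemble a standard-format MongoDB connection string.
// All input values below are placeholders, not real hosts or credentials.
function buildMongoUri({ username, password, hosts, defaultAuthDb = '', options = {} }) {
  // Credentials must be percent-encoded so characters such as '@' or ':'
  // are not mistaken for URI delimiters.
  const auth = username
    ? `${encodeURIComponent(username)}:${encodeURIComponent(password)}@`
    : '';
  const query = new URLSearchParams(options).toString();
  return `mongodb://${auth}${hosts.join(',')}/${defaultAuthDb}${query ? `?${query}` : ''}`;
}

const uri = buildMongoUri({
  username: 'transcend_reader',
  password: 'p@ss:word',
  hosts: ['db0.example.com:27017', 'db1.example.com:27017'],
  defaultAuthDb: 'admin',
  options: { replicaSet: 'rs0', tls: 'true' },
});
console.log(uri);
// mongodb://transcend_reader:p%40ss%3Aword@db0.example.com:27017,db1.example.com:27017/admin?replicaSet=rs0&tls=true
```

MongoDB Atlas typically provides a `mongodb+srv://` string instead, which this sketch does not cover.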

Enter the connection string into the Integration Connection Form in Transcend and select Connect.

You can use the MongoDB integration to programmatically identify personal information in your MongoDB database and pull it into Transcend as datapoints. Those datapoints are assigned a data category and processing purpose to help you classify internal data. You can also assign custom tags to your datapoints for further classification, tracking, and reporting. In this way, Transcend helps you discover, classify, and label data in a MongoDB database automatically so that your Data Inventory stays current.

The Datapoint Schema Discovery plugin in the MongoDB integration allows you to programmatically scan your database to identify the pieces of data it holds and pull them into Transcend as objects and properties. Once the data is in Transcend, it can be classified, labeled, and configured for DSRs.

The plugin operates by sampling the database and generating an object within Transcend for each identified collection. To handle larger collections, sampling is dynamic: the plugin halves the sample size and reruns the queries until they succeed, which keeps scans performant even on extensive datasets. The plugin combines a $sample query with $sort by _id and $limit queries to retrieve both the oldest and the newest documents, then merges the results. It also discovers embedded arrays within these collections; each is returned as its own object, prefixed with the name of the parent collection for clarity. This scanning process maps your database structure within Transcend.
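
The halve-and-retry behavior described above can be sketched as follows. This is an illustration under stated assumptions, not Transcend's implementation: `buildSamplePipelines`, `sampleWithHalving`, and the `runPipelines` callback are hypothetical names, and the sketch assumes any failure means the sample was too large. Only the `$sample`, `$sort`, and `$limit` aggregation stages are real MongoDB features.

```javascript
// Build three pipelines: a random sample, the oldest documents (ascending
// _id), and the newest documents (descending _id).
function buildSamplePipelines(size) {
  return [
    [{ $sample: { size } }],
    [{ $sort: { _id: 1 } }, { $limit: size }],
    [{ $sort: { _id: -1 } }, { $limit: size }],
  ];
}

// runPipelines is a hypothetical async callback that executes the pipelines
// (for example with collection.aggregate(...).toArray()) and returns the
// combined documents.
async function sampleWithHalving(runPipelines, startSize, minSize = 1) {
  let size = startSize;
  while (size >= minSize) {
    try {
      return { size, docs: await runPipelines(buildSamplePipelines(size)) };
    } catch (err) {
      // Assumption: the failure was caused by the sample being too large
      // (for example a timeout), so retry with half the size.
      size = Math.floor(size / 2);
    }
  }
  throw new Error('Sampling failed even at the minimum sample size');
}

// Stand-in runner that "times out" for any sample larger than 250 documents.
const fakeRunner = async (pipelines) => {
  const size = pipelines[0][0].$sample.size;
  if (size > 250) throw new Error('simulated timeout');
  return pipelines.map((_, i) => ({ _id: i }));
};

sampleWithHalving(fakeRunner, 1000).then(({ size }) => {
  console.log(size); // 250 (1000 and 500 fail first)
});
```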

Creating an object for each collection/embedded array creates an organization structure that mirrors the architecture of your data and keeps data grouped consistently. This makes it simple to keep track of the data hierarchy in Transcend, classify the data, and optionally implement DSRs against it.

Data Silo | Object | Property
MongoDB Database | Collections/Embedded Arrays | Properties

The image below gives an example of schema discovery results with nested properties. For objects, each :: separates nested layers of array fields. For properties, we provide the full path to the field in MongoDB. The [] symbolizes an array field, and :: symbolizes an object field.
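
To make the property notation concrete, here is a rough sketch that walks one sample document and prints field paths, appending `[]` to array fields and joining nested levels with `::`. The helper `listPropertyPaths` and the sample document are invented for illustration, and the exact strings Transcend generates may differ.

```javascript
// Illustration only: derive field paths from a single sample document.
function listPropertyPaths(value, prefix = '') {
  if (Array.isArray(value)) {
    // Mark the array, then describe the shape of its first element (if any).
    const arrayPath = `${prefix}[]`;
    return value.length > 0 ? listPropertyPaths(value[0], arrayPath) : [arrayPath];
  }
  if (value !== null && typeof value === 'object') {
    // Descend into nested objects, joining levels with '::'.
    return Object.entries(value).flatMap(([key, child]) =>
      listPropertyPaths(child, prefix ? `${prefix}::${key}` : key)
    );
  }
  return [prefix]; // primitive value: the path is complete
}

const sampleDoc = {
  name: 'Jane Doe',
  address: { city: 'Springfield' },
  orders: [{ sku: 'A-1', notes: ['gift'] }],
};
console.log(listPropertyPaths(sampleDoc));
// [ 'name', 'address::city', 'orders[]::sku', 'orders[]::notes[]' ]
```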

To enable the Datapoint Schema Discovery plugin, navigate to the Structured Discovery tab within the MongoDB data silo and toggle the plugin on. From there, you can set how often the plugin runs to discover new objects and properties as they are added to the database. Note: We recommend scheduling the plugin to run when the load on the database is lightest.

Transcend's Structured Discovery tool automatically classifies the data discovered in your database. By leveraging machine learning techniques, we can categorize and recommend the processing purpose for each piece of data discovered. With Structured Discovery, Transcend helps you keep your Data Inventory up to date through inevitable database schema changes. Check out our full Structured Discovery guide for more information about how it works.

With the Transcend MongoDB integration, you can fulfill DSRs directly against a MongoDB database: for each data action enabled on a datapoint, Transcend runs the MongoDB operation defined in a custom JSON payload.

The first step to setting up DSRs against a MongoDB database is creating the datapoints in the data silo that should be queried. We typically recommend creating a datapoint for each collection in the database that stores personal data (or any collections you want to action DSRs against). For example, let's say there is a collection called Chat History that contains all the messages sent back and forth from a customer. You could create a datapoint for Chat History in the data silo and enable the specific data actions needed. If you're using Structured Discovery, you can enable the Datapoint Schema Discovery plugin to create the datapoints for you automatically.

Pro tip: Check out the Transcend Terraform Provider for options on managing data silos and datapoints in code.

For each data action enabled for a datapoint in the MongoDB data silo, you can define a JSON payload that will execute a database operation. Using the previous Chat History example, let's say you want to enable the Chat History datapoint to support access/right to know requests. With the “access” data action enabled, you can define a specific JSON payload that executes the request to find the Chat History for a user against the database. The next sections describe the supported query types, explain how to construct the custom payload, and provide sample queries.

Transcend's MongoDB integration supports the query operation types needed to fulfill the different kinds of DSRs a data subject can make (access/right to know, erasure, opt-out of communication, etc.). For example, an access request could be actioned with a find or findOne query type. The full list of supported query types is below:

  • find
  • findOne
  • updateOne
  • updateMany
  • deleteOne
  • deleteMany
  • replaceOne

As mentioned above, the MongoDB integration uses a custom query payload to execute operations on the database. The parameters to be included in the payload are described below. Please note that all the parameters are required and must be included in the payload.

  • database: the name of the database to query
  • collection: the collection inside database to query
  • type: the type of query operation to execute (find, findOne, deleteOne, etc.)
  • payload: the custom JSON payload defining the specifics of the operation
    • Note that the shape of the payload depends on the operation type. See below for specific examples.
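
Before saving a payload, it can help to check it locally against the rules above. The following is a minimal sketch of such a pre-check; `validateRequest` is a hypothetical helper, not part of Transcend, and it only verifies that the four parameters are present and that the type is one of the supported query types.

```javascript
// Query types listed as supported by the integration.
const SUPPORTED_TYPES = [
  'find', 'findOne', 'updateOne', 'updateMany',
  'deleteOne', 'deleteMany', 'replaceOne',
];
const REQUIRED_PARAMS = ['database', 'collection', 'type', 'payload'];

// Hypothetical local pre-check; returns a list of problems (empty if none).
function validateRequest(request) {
  const errors = REQUIRED_PARAMS
    .filter((key) => request[key] === undefined || request[key] === null)
    .map((key) => `missing required parameter: ${key}`);
  if (request.type && !SUPPORTED_TYPES.includes(request.type)) {
    errors.push(`unsupported type: ${request.type}`);
  }
  return errors;
}

console.log(validateRequest({ database: 'test-database', type: 'aggregate' }));
// [
//   'missing required parameter: collection',
//   'missing required parameter: payload',
//   'unsupported type: aggregate'
// ]
```
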

Example Payloads & MongoDB operations

Below is an example of a custom payload for each query type and the corresponding MongoDB operation that is executed.

find - collection.find()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "find",
  "payload": {
    "query": {
      "name": "Jane Doe"
    },
    "option": {
      "projection": {
        "_id": 0
      }
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .find({ name: 'Jane Doe' }, { projection: { _id: 0 } });

findOne - collection.findOne()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "findOne",
  "payload": {
    "query": {
      "name": "Jane Doe"
    },
    "option": {
      "projection": {
        "_id": 0
      }
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .findOne({ name: 'Jane Doe' }, { projection: { _id: 0 } });

updateOne - collection.updateOne()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "updateOne",
  "payload": {
    "filter": {
      "name": "Jane Doe"
    },
    "query": {
      "$set": { "name": "John Doe" }
    },
    "option": {
      "upsert": true
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .updateOne(
    { name: 'Jane Doe' },
    { $set: { name: 'John Doe' } },
    {
      upsert: true,
    }
  );

updateMany - collection.updateMany()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "updateMany",
  "payload": {
    "filter": {
      "name": "Jane Doe"
    },
    "query": {
      "$set": { "name": "John Doe" }
    },
    "option": {
      "upsert": true
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .updateMany(
    { name: 'Jane Doe' },
    { $set: { name: 'John Doe' } },
    {
      upsert: true,
    }
  );

deleteOne - collection.deleteOne()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "deleteOne",
  "payload": {
    "query": {
      "name": "Jane Doe"
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .deleteOne({ name: 'Jane Doe' });

deleteMany - collection.deleteMany()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "deleteMany",
  "payload": {
    "query": {
      "name": "Jane Doe"
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .deleteMany({ name: 'Jane Doe' });

replaceOne - collection.replaceOne()

Sample Payload

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "replaceOne",
  "payload": {
    "filter": {
      "name": "Jane Doe"
    },
    "query": {
      "name": "John Doe"
    },
    "option": {
      "upsert": true
    }
  }
}

MongoDB operation

client
  .db('test-database')
  .collection('test-collection')
  .replaceOne(
    { name: 'Jane Doe' },
    { name: 'John Doe' },
    {
      upsert: true,
    }
  );
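
The pattern across the examples above can be summarized in code. This sketch is inferred from those examples and is not Transcend's actual implementation; `argsFor`, `execute`, and `fakeClient` are hypothetical names. In real use, `client` would be a connected MongoClient from the Node.js driver, while the stand-in below only records the call so the sketch runs without a database.

```javascript
// Map a payload to the argument list of the matching driver method.
function argsFor(type, payload) {
  const { filter, query, option } = payload;
  switch (type) {
    case 'find':
    case 'findOne':
      return [query, option];
    case 'updateOne':
    case 'updateMany':
    case 'replaceOne':
      // "filter" selects documents; "query" holds the update or replacement.
      return [filter, query, option];
    case 'deleteOne':
    case 'deleteMany':
      return [query];
    default:
      throw new Error(`Unsupported query type: ${type}`);
  }
}

function execute(client, { database, collection, type, payload }) {
  const target = client.db(database).collection(collection);
  return target[type](...argsFor(type, payload));
}

// Stand-in client that records the call instead of contacting a database.
const fakeClient = {
  db: (dbName) => ({
    collection: (collName) =>
      new Proxy({}, {
        get: (_, method) => (...args) => ({ dbName, collName, method, args }),
      }),
  }),
};

const recorded = execute(fakeClient, {
  database: 'test-database',
  collection: 'test-collection',
  type: 'updateOne',
  payload: {
    filter: { name: 'Jane Doe' },
    query: { $set: { name: 'John Doe' } },
    option: { upsert: true },
  },
});
console.log(recorded.method, JSON.stringify(recorded.args));
```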

We also support a few template variables that can be used inside your query.

  • {{identifier}}: replaced with the value of the identifier
  • {{requestId}}: replaced with the ID of the request

For example, to query for a user by a name identifier, the payload would look like this:

{
  "database": "test-database",
  "collection": "test-collection",
  "type": "findOne",
  "payload": {
    "query": {
      "name": "{{identifier}}"
    },
    "option": {
      "projection": {
        "_id": 0
      }
    }
  }
}
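
To preview what a templated payload would look like once the variables are filled in, a local helper along these lines can be used. This is only a sketch: Transcend performs the real substitution when the request runs, and `renderTemplate` and the sample values are hypothetical.

```javascript
// Replace {{name}} placeholders in every string value of a payload.
// Placeholders with no matching variable are left untouched.
function renderTemplate(value, variables) {
  if (typeof value === 'string') {
    return value.replace(/\{\{(\w+)\}\}/g, (match, name) =>
      name in variables ? String(variables[name]) : match
    );
  }
  if (Array.isArray(value)) {
    return value.map((item) => renderTemplate(item, variables));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, child]) => [key, renderTemplate(child, variables)])
    );
  }
  return value; // numbers, booleans, and null pass through unchanged
}

const rendered = renderTemplate(
  { query: { name: '{{identifier}}' }, option: { projection: { _id: 0 } } },
  { identifier: 'Jane Doe', requestId: 'example-request-id' }
);
console.log(JSON.stringify(rendered));
// {"query":{"name":"Jane Doe"},"option":{"projection":{"_id":0}}}
```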