Sombra Overview
Transcend is extraordinarily connected across your data stack, and that must come with extraordinary security measures. Rather than ask you to simply trust Transcend with your business data, we use a secure-by-design architecture which ensures that Transcend cannot see your data, does not have direct access to your systems, and cannot arbitrarily operate on your data.
In short: Sombra is the part of Transcend which actually connects to your data—it's a containerized application which scans your data and operates on it. Sombra can be self-hosted by you. If you self-host Sombra, Transcend's Cloud does not have access to your data systems at all.
Data is end-to-end encrypted between your business systems, your company admins, and where applicable, your end-users. By design, it is not possible for Transcend to see your data.
Before data arrives at Transcend's API, it is encrypted by a gateway called Sombra. When data is viewed by a user, it is decrypted on their device by our embedded client-side decryption technology called Penumbra.
Crucially, because data is encrypted by Sombra and decrypted client-side, Transcend never has access to unencrypted data. Furthermore, Transcend cannot directly query data from your business system—Transcend can only request that Sombra make the request on Transcend's behalf, and Sombra only responds with encrypted data.
Let's show an example: as part of fulfilling a data subject access request, Transcend is exporting someone's personal data from your SaaS tool. In this example, Transcend is exporting this person's support tickets from the Zendesk API (using the Transcend + Zendesk integration). Since only Sombra has your Zendesk API key, Transcend cannot make a request directly to Zendesk's API, and must reverse-proxy an HTTP request to Zendesk's API through Sombra. After authorizing Transcend's request, Sombra rewrites the HTTP request with the Zendesk API key, and makes the request to Zendesk on Transcend's behalf.
When the Zendesk API responds, Sombra generates a content encryption key, encrypts the data, and responds to Transcend with encrypted data.
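The request flow in this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sombra's actual implementation: `Request`, `rewrite_request`, and `proxy` are hypothetical names, the `Bearer` header format is assumed, and the `send` and `encrypt` callables stand in for the real HTTP client and encryption step.

```python
from typing import Callable, Dict, NamedTuple


class Request(NamedTuple):
    """Hypothetical, simplified view of an HTTP request."""
    method: str
    url: str
    headers: Dict[str, str]


def rewrite_request(incoming: Request, api_key: str) -> Request:
    """Return a copy of the request carrying the integration credential.

    Any Authorization header supplied by the caller is dropped, so the
    caller can neither read nor choose the credential that is used.
    """
    headers = {
        name: value
        for name, value in incoming.headers.items()
        if name.lower() != "authorization"
    }
    # Assumed header format; a real integration may authenticate differently.
    headers["Authorization"] = f"Bearer {api_key}"
    return Request(incoming.method, incoming.url, headers)


def proxy(
    incoming: Request,
    api_key: str,
    send: Callable[[Request], bytes],
    encrypt: Callable[[bytes], bytes],
) -> bytes:
    """Forward an already-authorized request and return only ciphertext."""
    outgoing = rewrite_request(incoming, api_key)
    plaintext = send(outgoing)   # response from the upstream API
    return encrypt(plaintext)    # the caller never receives plaintext
```

The point of the design is that the API key and the plaintext response exist only inside `proxy`; the caller supplies a request and gets back whatever `encrypt` produces.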
This method generalizes to SQL queries to your databases (using Transcend + database integrations), webhooks to your systems, and network requests to any API. It is used across both Transcend Structured Discovery, which catalogs and classifies the data inside your data systems, and Transcend DSR Automation, which operates on individuals' data across your data stack.
In our security architecture, Sombra treats the Transcend API as an untrusted third party. This is analogous to other end-to-end encryption frameworks, such as E2EE messaging apps like Signal: while information may be transferred over third-party communications infrastructure, the data is immune to monitoring and tampering. This architecture extends beyond the confidentiality and integrity of data transfers to include commands to your systems.
Commands to your systems (such as to delete someone's personal data across your stack) must be authorized and authentic. For example, Transcend cannot choose to delete a user from your system. Sombra does not trust Transcend's instructions to delete an individual's data; instead, Sombra expects a cryptographically signed token from the individual making the request to delete their data. Select admins in your organization (such as someone in your legal department) may also sign these tokens on behalf of individual users.
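As a rough illustration of checking a signed token before acting on a command, the sketch below uses an HMAC-SHA256 tag from Python's standard library. This is an assumption made to keep the example self-contained: the source does not specify the token format or signature scheme, and a production system would more likely use asymmetric signatures. The function names, claim fields, and the shared `secret` are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_request(secret: bytes, claims: dict) -> str:
    """Serialize the claims deterministically and append an HMAC-SHA256 tag."""
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{tag}"


def verify_request(secret: bytes, token: str, expected_action: str) -> bool:
    """Accept a command only if the tag is valid and the action matches."""
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return False
    return json.loads(payload).get("action") == expected_action


# Hypothetical usage: the secret is held by the signer and the verifier only.
secret = b"known-to-signer-and-verifier-only"
token = sign_request(secret, {"action": "ERASURE", "subject": "user-123"})
assert verify_request(secret, token, "ERASURE")
assert not verify_request(secret, token.replace("user-123", "user-456"), "ERASURE")
```

A party that relays the token but does not hold the signing secret cannot alter the subject or the action without invalidating the tag.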
If this sounds complex, don't worry! All this cryptography work is abstracted behind seamless web interfaces—using Transcend feels like using any other web application.
Personal data is between you and your users. Data is encrypted with AES-256 before it leaves your firewall, and only decrypted on an authorized user's device—Transcend never sees it.
Transcend's backend does not have access to your API keys, so it cannot connect to your business systems directly. Transcend connects to your business systems through Sombra. Sombra, and not Transcend, manages the access keys to your business systems. Sombra has a key management system built in, and can optionally delegate key management to another of your key management systems (like AWS KMS) to generate and manage keys on hardware security modules (HSMs).
Operations on personal data must be associated with an authentic request. For example, to operate on an individual's data, Sombra requires a cryptographic proof that the relevant user requested that operation themselves after proving their identity. An admin on your team may also request that operation on the user's behalf, and they must also prove their identity. It is not possible for Transcend to delete someone else's data.
Be certain that a requestor is who they claim to be. Sombra can verify your end-users' identities with OAuth 2, JWT Magic Links, and more. Admins in your organization can also authenticate themselves to Sombra using SAML single sign-on. Because Sombra performs authentication, Transcend cannot escalate its privileges (e.g., spoof a user or an admin making a data access request).
Sombra is a gateway that sits in front of Transcend's API and encrypts your data before it arrives at Transcend. That way, the data is not visible to Transcend. Sombra is a simple Node.js server that proxies requests, and has no storage requirements. You can choose between two possible architectures involving Sombra: self-hosted (within your firewall) and Transcend-hosted (within a Transcend-owned firewall).
The self-hosted architecture guarantees the highest level of security, since hosting Sombra within your firewall ensures no unencrypted data is ever available to Transcend, and you keep your keys in-house. By hosting Sombra yourself, the data can only be decrypted by you and by the user requesting their data; Transcend never has access to your keys (the source is available for review so that you may verify these properties). Under this architecture, Sombra sits in your network’s DMZ and can optionally use your key management service such as AWS KMS to generate and manage data keys on hardware security modules (HSMs).
If you prefer a full SaaS solution, you can use the Transcend-hosted Sombra, which comes ready out of the box with Transcend. Under this architecture, your data is still encrypted in the Transcend backend, because all incoming requests are still routed through Sombra, but Sombra is hosted in a Transcend-managed cloud. The keys are stored by Transcend’s KMS, meaning we have the means to decrypt the data sent to us. As a strict company policy, Transcend never does so, and there are strong technical measures in place to prevent it.
Whether you use the Transcend-hosted version of Sombra or host it yourself, the API is the same.
Just as Sombra is the encryption module on the backend, Penumbra is the decryption module on the frontend. Sombra and Penumbra form the two "ends" of the "end-to-end encryption" architecture.
Our web interfaces are largely powered by encrypted data. Since Transcend's backend servers only have encrypted copies of your data, we cannot serve unencrypted data to a user interface. Instead, we serve encrypted data which can be decrypted on a client device by Penumbra. Specifically, Penumbra is a decryption technology which operates on a background thread in the browser's runtime.
In cases where a user needs to view unencrypted data (and they have permission to do so) the user can decrypt data on their device using a decryption key. To fetch this decryption key, the user must be authenticated and have the right privileges. A user verifies their identity through a seamless web-based authentication flow (such as account login), Penumbra forms a secure channel with Sombra to pass the user's authentication information, and Sombra attempts to verify the user. If Sombra successfully verifies the user (and if they have permission to retrieve the requested decryption key), Sombra responds to Penumbra with the decryption key, and Penumbra uses it to decrypt the data.
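The key-release decision described above can be pictured as a two-step gate: authentication first, then a permission check. The following is a hypothetical sketch, not Sombra's API; `KeyGateway`, its in-memory dictionaries, and the `verified_user` argument (assumed to be the outcome of a prior identity check) are illustrative only.

```python
from typing import Dict, Optional, Set


class KeyGateway:
    """Hypothetical gate that releases a decryption key only to permitted users."""

    def __init__(self, keys: Dict[str, bytes], grants: Dict[str, Set[str]]):
        self._keys = keys      # key_id -> decryption key
        self._grants = grants  # user_id -> key_ids the user may retrieve

    def release_key(self, verified_user: Optional[str], key_id: str) -> bytes:
        # Step 1: the caller must already have been authenticated.
        if verified_user is None:
            raise PermissionError("user is not authenticated")
        # Step 2: the authenticated user must hold a grant for this key.
        if key_id not in self._grants.get(verified_user, set()):
            raise PermissionError("user may not retrieve this key")
        return self._keys[key_id]
```

Keeping both checks on the gateway side means the party that stores the ciphertext never takes part in deciding who may decrypt it.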
All of this happens seamlessly through our web interfaces. To a user, there is no visual difference between a Transcend interface using encrypted data and a typical web interface: it's as if the data were served normally.
Transcend's Admin Dashboard (used by you and your team) and the Privacy Center (used by your end-users) both have Penumbra under the hood. Note: depending on the Transcend products you use, the Privacy Center may not be applicable to your Transcend implementation—the Privacy Center is part of the Transcend DSR Automation product.
- In the Admin Dashboard, admins can be given permission to decrypt data. For example, an admin can decrypt samples of real data in Transcend Structured Discovery, or the content of a data export associated with a data subject access request in Transcend DSR Automation.
- In the Privacy Center, end-users requesting access to their data have, of course, permission to decrypt their own data export.
To make all of this possible, Transcend Engineering built and open-sourced Penumbra, the first client-side decryption streaming technology. Like Sombra, Penumbra streams all content, which means data never has to fully buffer into memory. Since Transcend's E2EE stack purely streams data, hardware memory is not a constraint, and payloads of any size can be transferred with end-to-end encryption.
Once Penumbra has begun decrypting data, it can stream the unencrypted output for display in a web interface (e.g., preview data in the Admin Dashboard), or download the data to disk. Since many exports include several files, Transcend Engineering also built and open-sourced Conflux, the first client-side zip-streaming technology, which takes many file streams as input, and outputs one .zip file stream.
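To make the "many file streams in, one .zip stream out" idea concrete, here is a small server-side analogue using Python's standard `zipfile` module. It is not Conflux (which runs in the browser) and does not reflect its API; `zip_streams` is a hypothetical helper, and each input is assumed to be an iterable of byte chunks.

```python
import io
import zipfile
from typing import BinaryIO, Dict, Iterable


def zip_streams(files: Dict[str, Iterable[bytes]], out: BinaryIO) -> None:
    """Write each chunked input into one zip archive, one chunk at a time."""
    with zipfile.ZipFile(out, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, chunks in files.items():
            # Opening an entry in write mode lets it be filled incrementally,
            # so no input file has to be held in memory as a whole.
            with archive.open(name, mode="w") as entry:
                for chunk in chunks:
                    entry.write(chunk)


# Hypothetical usage with in-memory data standing in for real file streams.
buffer = io.BytesIO()
zip_streams({"tickets.json": iter([b'{"tickets"', b": []}"])}, buffer)
```

Because entries are written chunk by chunk, the memory needed depends on the chunk size rather than on the size of any file in the export.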
- If you're self-hosting Sombra, follow this guide to deploy Sombra in minutes.
- If Transcend is managing Sombra for you, there is no configuration required from you.
- If you're curious about how Transcend offers a seamless web experience powered by encrypted data, check out Penumbra and Conflux on GitHub.