Sombra Glossary

This glossary provides definitions for key terminology and concepts related to Sombra and the broader Transcend ecosystem. It focuses on Transcend-specific terms that may not be commonly known to DevOps engineers.

A deployment option for Sombra using Amazon's Elastic Container Service (ECS). Previously recommended, it is now considered less suitable than Kubernetes-based deployments because it lacks the GPU support needed for the LLM Classifier and offers less flexibility for scaling and resource management.

A service that enables secure, private communication between your self-hosted Sombra deployment and Transcend's backend services without using the public internet. It creates a private endpoint in your VPC that connects directly to Transcend's services.

A feature in Sombra that efficiently manages database connections by maintaining a pool of connections for each database endpoint. This prevents unbounded connections to upstream databases and improves performance.
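The pooling behavior described above can be illustrated with a minimal Python sketch. It is not Sombra's implementation: the `EndpointPools` class, the `connect` factory, and the `max_per_endpoint` limit are hypothetical names chosen for illustration. The sketch only shows the general idea of keeping one bounded pool per database endpoint so that the number of open connections to any upstream database cannot grow without limit.

```python
import queue
import threading
from contextlib import contextmanager


class EndpointPools:
    """Keeps one bounded pool of connections per database endpoint (illustrative)."""

    def __init__(self, connect, max_per_endpoint=5):
        self._connect = connect      # hypothetical factory: endpoint -> connection
        self._max = max_per_endpoint
        self._idle = {}              # endpoint -> queue of idle connections
        self._opened = {}            # endpoint -> number of connections opened
        self._lock = threading.Lock()

    def _acquire(self, endpoint, timeout):
        with self._lock:
            pool = self._idle.setdefault(endpoint, queue.Queue())
            under_limit = self._opened.get(endpoint, 0) < self._max
            open_new = pool.empty() and under_limit
            if open_new:
                self._opened[endpoint] = self._opened.get(endpoint, 0) + 1
        if open_new:
            try:
                return pool, self._connect(endpoint)
            except Exception:
                with self._lock:
                    self._opened[endpoint] -= 1  # give the slot back on failure
                raise
        # At the limit (or an idle connection exists): reuse, waiting if needed.
        return pool, pool.get(timeout=timeout)

    @contextmanager
    def connection(self, endpoint, timeout=None):
        pool, conn = self._acquire(endpoint, timeout)
        try:
            yield conn
        finally:
            pool.put(conn)           # return the connection for reuse
```

A caller would wrap each query in `with pools.connection("some-endpoint") as conn:`. Once the per-endpoint limit is reached, further callers wait for a released connection (or get `queue.Empty` after `timeout`) instead of opening a new one.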

An open-source client-side zip-streaming technology built by Transcend that takes multiple file streams as input and outputs a single .zip file stream. Conflux works with Penumbra to enable streaming of encrypted data without buffering entire files in memory.
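The "multiple file streams in, one .zip stream out" idea can be sketched with Python's standard `zipfile` module. This is only an analogy: Conflux itself runs client-side, and the `zip_streams` function and its parameters below are hypothetical and are not part of Conflux's API. The point is that each input is copied chunk by chunk, so no input file is held in memory in full.

```python
import io
import zipfile


def zip_streams(named_streams, out, chunk_size=64 * 1024):
    """Write each (name, readable binary stream) pair into one zip archive on `out`."""
    with zipfile.ZipFile(out, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, stream in named_streams:
            # archive.open(..., "w") gives a writable handle for one zip entry.
            with archive.open(name, mode="w") as entry:
                while chunk := stream.read(chunk_size):
                    entry.write(chunk)
```

For example, `zip_streams([("a.txt", io.BytesIO(b"alpha"))], some_output_file)` writes a one-entry archive to `some_output_file`.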

A system or service that contains personal data, such as a database, SaaS application, or file storage system. Transcend connects to Data Silos to discover, catalog, and process personal data.

A request made by an individual (data subject) to access, delete, or modify their personal data, a right protected under privacy regulations such as GDPR and CCPA. Transcend automates the fulfillment of these requests across all connected data systems.

The process by which end-users prove their identity to Sombra when interacting with the Privacy Center. Sombra supports multiple authentication methods including OAuth 2.0 and JWT Magic Links.

A networking configuration option for Sombra where the Transcend backend sends traffic directly to Sombra. This requires exposing Sombra to the internet and configuring firewall ingress rules, TLS, and load balancers. It is less commonly used than the Reverse Tunnel method.

A basic method for running Sombra using direct docker run commands. While simple to set up, this approach lacks automatic container restart capabilities and other production features, making it suitable primarily for testing and evaluation rather than production environments.

The encryption methodology used by Sombra where data is encrypted before it leaves your network and is only decrypted on the authorized user's device. Transcend never has access to unencrypted data in this architecture.

The process by which administrators and staff at your company prove their identity to Sombra when using the Admin Dashboard. Sombra supports SAML 2.0 single sign-on (SSO) and Transcend authentication methods.

A configuration option that allows Sombra to fetch secrets like encryption keys, certificates, and passphrases directly from HashiCorp Vault. This enables more secure management of sensitive credentials without storing them as environment variables.

The recommended production deployment method for Sombra, using Helm charts with Kubernetes. This approach provides cloud-agnostic support, enhanced scaling capabilities, and support for the LLM Classifier through GPU-enabled nodes. It offers greater flexibility and maintainability compared to other deployment options.

A symmetric key used to authenticate your internal applications to the Sombra API. This key is typically sent as a Bearer token in the x-sombra-authentication header for requests to Sombra's internal endpoints.
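As a rough illustration of how an internal application might attach this key, the Python sketch below builds (but does not send) an HTTP request with the key as a Bearer token. The header name is taken from the definition above and should be checked against your Sombra version. The `build_sombra_request` helper and the example URL are hypothetical.

```python
import urllib.request


def build_sombra_request(url, internal_key, header_name="x-sombra-authentication"):
    """Return a Request that carries the internal key as a Bearer token."""
    request = urllib.request.Request(url, method="POST")
    # urllib stores header names in capitalized form; HTTP header names are
    # case-insensitive, so this does not change what the server sees.
    request.add_header(header_name, f"Bearer {internal_key}")
    return request


# Hypothetical usage; the URL below is a placeholder, not a real Sombra endpoint.
example = build_sombra_request("https://sombra.internal.example/", "my-internal-key")
```

In practice the key would be read from a secret store or environment variable rather than written into source code.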

An authentication method for data subjects that uses JSON Web Tokens (JWT) in email magic links to verify a user's identity. This method requires configuring JWT_AUTHENTICATION_PUBLIC_KEY in Sombra and JWT_AUTHENTICATION_PRIVATE_KEY in your signing server.

A critical environment variable in Sombra that contains the JSON Web Token asymmetric key(s) for signing Sombra payloads. It's also used for key derivation when using the local KMS provider. This key should be kept secure and can be rotated periodically.

The system used by Sombra to manage encryption keys. Sombra can use its own internal KMS implementation (local) or integrate with cloud provider KMS services like AWS KMS to generate and manage keys on hardware security modules (HSMs).

The security practice of periodically changing encryption keys used by Sombra. Transcend allows rotation of both the INTERNAL_KEY and JWT_ECDSA_KEY to enhance security. Sombra retains the four most recent sets of keys to maintain access to previously encrypted data.
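The retention rule above (keep the four most recent key sets so older data stays readable) can be sketched as a small bounded key ring. This is a conceptual illustration, not Sombra's code: the `KeyRing` class and the key identifiers are hypothetical.

```python
from collections import deque


class KeyRing:
    """Holds the N most recent keys; the newest one is used for new encryption."""

    def __init__(self, retain=4):
        # deque(maxlen=...) silently drops the oldest entry when it is full.
        self._keys = deque(maxlen=retain)

    def rotate(self, key_id, key):
        """Make `key` the current key; the oldest retained key may fall off."""
        self._keys.appendleft((key_id, key))

    def current(self):
        """Return (key_id, key) for the newest key."""
        return self._keys[0]

    def lookup(self, key_id):
        """Find a retained key for decrypting older data, or raise KeyError."""
        for kid, key in self._keys:
            if kid == key_id:
                return key
        raise KeyError(key_id)
```

With `retain=4`, data encrypted under any of the last four keys can still be decrypted after a rotation, while data encrypted under a fifth, older key can no longer be read.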

The container orchestration platform used for the recommended Sombra deployment method. Kubernetes provides high availability, automatic scaling, and simplified management across different cloud environments or on-premises infrastructure. It supports GPU workloads needed for the LLM Classifier.

An optional component that can be deployed alongside Sombra, which uses a large language model to classify data found in your systems. It requires GPU access and helps Transcend Discovery products identify personal and sensitive data.

A feature that enables Sombra to send its logs to Transcend's servers for troubleshooting purposes. This is configured using environment variables like LOG_HTTP_TRANSPORT_URL and can be turned off once issues are resolved to maintain data privacy.

An authentication method for data subjects that enables them to authenticate using your existing login system. This method requires configuring various environment variables related to obtaining access tokens and user information.

An open-source client-side decryption streaming technology built by Transcend that operates in the browser to decrypt data encrypted by Sombra. Penumbra is the frontend counterpart to Sombra's backend encryption, forming the "end" in end-to-end encryption.

A user-facing web interface provided by Transcend where your end-users can initiate data subject access requests (DSARs), manage their privacy preferences, and download their encrypted data. End-users use Penumbra to decrypt their data exports.

The capability to operate Transcend components entirely within specific geographic regions. Self-hosted Sombra can be configured to communicate with Transcend's backend in a specific region (EU or US) using the TRANSCEND_URL environment variable.

The recommended networking configuration for Sombra that uses a secure tunnel to allow Transcend Cloud to communicate with Sombra in your private network without exposing Sombra to the internet. This simplifies setup and enhances security.

An authentication protocol used by Sombra to verify the identity of administrators and staff. This allows your employees to use your existing identity provider (like Okta or Azure AD) to access the Transcend Admin Dashboard.

A deployment option where Sombra is hosted within your network or VPC, providing the highest level of security since no unencrypted data ever leaves your infrastructure and you maintain control of all encryption keys.

The configurable time period for which user authentication sessions remain valid in Sombra. This can be set separately for employees (EMPLOYEE_SESSION_EXPIRY_TIME) and data subjects (DATA_SUBJECT_SESSION_EXPIRY_TIME) to comply with security requirements.

The cryptographic process by which Sombra establishes a secure communication channel with clients using a Diffie-Hellman Key Exchange. This generates a shared key that allows encrypted communication where Transcend cannot access the unencrypted data.
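To show why a Diffie-Hellman exchange yields a key that an intermediary never sees, here is a toy Python sketch. It assumes classic finite-field Diffie-Hellman with a deliberately small prime. The parameters, function names, and the SHA-256 key derivation step are illustrative assumptions and do not describe the parameters or algorithms Sombra actually uses.

```python
import hashlib
import secrets

# Toy parameters for illustration only: far too small for real-world security.
P = 2**127 - 1   # a Mersenne prime
G = 3


def generate_keypair():
    """Return (private, public); only the public value is sent to the peer."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_shared_key(own_private, peer_public):
    """Combine our private value with the peer's public value into a 32-byte key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each side keeps its private value and exchanges only the public one.
client_private, client_public = generate_keypair()
server_private, server_public = generate_keypair()
client_key = derive_shared_key(client_private, server_public)
server_key = derive_shared_key(server_private, client_public)
```

Both sides arrive at the same key, while a party that only relays the two public values cannot compute it from those values alone.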

A self-hostable security gateway developed by Transcend that acts as a reverse proxy, encrypting data before it leaves your infrastructure. Sombra manages API keys, authentication, and encryption to ensure Transcend cannot access your unencrypted data or directly connect to your business systems.

An HTTP API hosted by Sombra that serves as an API gateway to the Transcend Cloud. It encrypts customer data before it enters the Transcend infrastructure and can be used for programmatic interactions with Transcend services.

A deployment of one or more Sombra nodes working together to handle requests. Multiple Sombra clusters can be deployed across different VPCs, cloud providers, or hybrid environments.

The server component of Sombra that handles all inbound communication from your internal servers. This server hosts an API that your services use to make requests to Transcend. Traffic is restricted to allowed IP ranges and authenticated with API keys.

The various methods for deploying Sombra, including Helm charts for Kubernetes, Docker containers, AWS ECS, and cloud marketplace solutions. Each option provides different levels of customization and automation to fit various infrastructure requirements.

A unique identifier for a Sombra cluster that is used to associate specific data silos with that cluster. When using multiple Sombra clusters, each needs its own Sombra ID.

A special API key used by Sombra to establish a secure tunnel connection to Transcend's backend when using the Reverse Tunnel networking method. This key is generated when creating a new Self-Hosted Sombra in the Admin Dashboard.

The set of environment variables (SOMBRA_TLS_CERT, SOMBRA_TLS_KEY, SOMBRA_TLS_KEY_PASSPHRASE) used to configure secure TLS connections when using the Direct Connection method. These are not required when using the Reverse Tunnel method.

A web interface used by your team to manage privacy operations, view encrypted data samples, and administer Transcend services. Like the Privacy Center, it utilizes Penumbra for client-side decryption.

The hosted portion of Transcend's infrastructure that stores encrypted data and manages operations but cannot access your unencrypted data when using a self-hosted Sombra.

Collective term for Transcend's data discovery solutions, including:

  • Structured Discovery: Discovers and classifies personal data in structured data sources like databases
  • Unstructured Discovery: Identifies personal data in unstructured content like documents and files
  • Silo Discovery: Helps identify previously unknown data silos in your organization

A product that automates the fulfillment of data subject access requests across your entire data ecosystem, using Sombra to securely connect to your systems without exposing data to Transcend.

A deployment option where Transcend hosts and manages Sombra within its cloud infrastructure. While this provides a full SaaS solution, it means Transcend technically has the means to decrypt your data, although it is not permitted to do so and cannot do so under normal circumstances.