Deploying your Self-Hosted Security Gateway

By self-hosting the security gateway (AKA "Sombra"), you will be able to ensure your customer data and API keys stay within your cloud. Learn more about how Sombra works here.

Hosting Sombra is easy. It's a simple, stateless Docker container configured through environment variables. If you use Terraform and AWS, you can use our Terraform Module. If you have a pre-defined pipeline in your stack for deploying Docker images, you can use that instead.

Check out our Terraform Module for Deploying Sombra on AWS.

Sombra is not a publicly available Docker image, so you need authentication to pull the image.

If you're using AWS, you can pull our image from ECR using AWS IAM permissions. We recommend this method because it provides the strongest security. To do so, use 829095311197.dkr.ecr.eu-west-1.amazonaws.com/sombra:<version_tag> as your image name, replacing <version_tag> with the version you want to deploy, or prod for the latest version.

Before pulling the Sombra image, please reach out to us with your AWS account IDs so we can add them to our permissions list.
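The ECR flow above can be sketched as follows. This is a minimal sketch, assuming the AWS CLI and Docker are installed and that your AWS account has already been added to the permissions list. The function names are hypothetical and exist only for illustration.

```shell
# Sketch: pull the Sombra image from ECR with IAM credentials.
# Assumes the AWS CLI and Docker are installed and your AWS account is allowlisted.
SOMBRA_ECR_REGISTRY="829095311197.dkr.ecr.eu-west-1.amazonaws.com"

# Build the full image name; the tag defaults to "prod" (the latest version).
sombra_ecr_image() {
  echo "${SOMBRA_ECR_REGISTRY}/sombra:${1:-prod}"
}

# Log Docker in to the registry with a short-lived ECR token, then pull the tag.
pull_sombra_from_ecr() {
  aws ecr get-login-password --region eu-west-1 |
    docker login --username AWS --password-stdin "$SOMBRA_ECR_REGISTRY" &&
    docker pull "$(sombra_ecr_image "${1:-prod}")"
}
```

Calling pull_sombra_from_ecr with no argument pulls the prod tag; pass a specific version tag to pin a release.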

If you're deploying to any other cloud provider or an on-premise environment, you can pull our image from Transcend's private Docker registry using basic authentication.

First, please reach out to us and request permission to pull our Sombra image, and we will add your Transcend account to our permissions list.

To authenticate, you must first log into our private registry:

docker login docker.transcend.io

You will be prompted to enter the basic auth credentials. The username will always be "Transcend" (this is case-sensitive), and the password will be any API Key for your organization within the Admin Dashboard (note: a scope is not required for the API key).

Once you've logged in, you may pull images by running:

docker pull docker.transcend.io/sombra:<version_tag>

Replace <version_tag> with either prod for the latest version, or the specific version you want to deploy.
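For CI pipelines, where an interactive prompt is not available, the login and pull steps above might be scripted as below. This is a sketch, not an official script: TRANSCEND_API_KEY and the function name are hypothetical, and the variable is assumed to hold one of your organization's API keys.

```shell
# Sketch: log in to Transcend's private registry without an interactive prompt.
# TRANSCEND_API_KEY is a hypothetical variable holding one of your organization's API keys.
pull_sombra_from_transcend() {
  tag="${1:-prod}"
  # --password-stdin keeps the API key out of the process list and shell history.
  printf '%s' "$TRANSCEND_API_KEY" |
    docker login --username Transcend --password-stdin docker.transcend.io &&
    docker pull "docker.transcend.io/sombra:${tag}"
}
```

Calling pull_sombra_from_transcend with no argument pulls the prod tag.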

The Sombra container runs two servers:

Port: 5040

This port handles all communication coming from your internal systems. This server is responsible for encrypting any customer data before it hits the Transcend cloud. This port needs to be connected to a load balancer. You may restrict this load balancer to be accessible only within your firewall, or by the internal services that need to communicate with Transcend.

Port: 5041

This port handles all communications coming from Transcend's cloud. This is used for:

  1. Proxying Transcend's network requests to third-party software tools, and encrypting their response data before returning the data to Transcend.
  2. Proxying Transcend's webhooks to your internal services.
  3. Generating encryption keys for new privacy requests.
  4. Handling Diffie-Hellman channels initiated by employees on the Admin Dashboard, or your users on the Privacy Center.

We recommend using at least 2 CPUs or vCPUs, and at least 1 GB of RAM per CPU, like a t3.small on AWS (2 vCPUs, 2 GiB of memory). For companies with infrequent data requests, instances that support burst performance are a good choice.

If you expect to receive many privacy requests, and you are returning large data exports to your users (such as video libraries), then you may want to optimize your hardware for encryption speed. Sombra's encryption algorithm (AES-256-GCM) runs fastest on processors that support AES-NI. Since Sombra streams data, and never buffers whole files into memory, you do not need to worry about increasing your RAM in this case.

Transcend will always communicate with Sombra over TLS. However, you have the option to terminate TLS at the load balancer, or to continue using TLS within your cloud. Sombra supports both HTTPS and HTTP communication to your internal systems. See the section below on Ports to see how you can configure which ports map to HTTPS vs HTTP. If you decide to continue using TLS after the load balancer, you will need to give Sombra a TLS certificate through environment variables.

We recommend following your company's standards here. If you normally terminate TLS for other services, you should do that here as well. If you maintain TLS between services, you should configure Sombra to do so as well.

If you have a Transcend Sandbox, you have the option of configuring a single Sombra gateway that is used in both environments, or deploying two separate Sombra gateways: one in your sandbox environment, and one in production. The choice is yours, but you must coordinate this setup with your Transcend account representative.

The first thing you'll want to do is decide where you plan to host the gateway. You will need two URLs: one for sombra-customer-ingress and one for sombra-transcend-ingress. For example, this may look something like:

  • sombra-transcend-ingress: https://sombra-transcend.acme.com
  • sombra-customer-ingress: https://sombra-internal.acme.com

Once you decide on these subdomain names, please send them to your Transcend representative, and we will add these URLs to our allowlist.

Once you've decided on your subdomains, you'll need to attach each domain to a load balancer, and each load balancer will attach to a different port on the Sombra gateway. See Ports for more details.

The following example can be used as a minimum viable configuration for testing that your Sombra gateway boots up properly.

# This value can be found under "Sombra Audience" here: https://app.transcend.io/infrastructure/sombra
ORGANIZATION_URI=<FILL-ME>
# The root secrets that you should generate yourself and keep secret
# See the section below for information on how to generate these values
JWT_ECDSA_KEY="LS0tLS1CRUdJTiBFQyBQUklWQVRFIEtFWS0tLS0tCk1JR2tBZ0VCQkRCT0JkNExXVzNaTkJXOWhyTUJ4YlJUemx0SjZjWitIMm5GM3FybDgwdnpLbG1yMnFkRzU5YTUKOU1vWTJhWTJYWVNnQndZRks0RUVBQ0toWkFOaUFBUTBQOUI5Nm9FaVZhWmo3RnhRWThtM1JaMnRRRkVNaUhaWgpKTXk0NjdBcEJiRFRJZkpHRWh3MjAvcnljS3gxY25CUzRqYk5rdTVLNHh0TlpSMDcwVHNFWkREVmh3Y3kxNWRkCktWaDJGcVZvczkxVjVCSVUyK0xENUpYUGUweUVtM1U9Ci0tLS0tRU5EIEVDIFBSSVZBVEUgS0VZLS0tLS0K"
INTERNAL_KEY_HASH="wm/mZTcSALaEibJXmhdq8g7lUN19kgXQ4hWgjt3woE8="
# We recommend starting with the 'transcend' authentication method
# - After Single Sign On is setup, 'transcend' can be switched to 'saml'
# for the "EMPLOYEE_AUTHENTICATION_METHODS" arg.
# - After Account Login is setup, 'transcend' can be switched to 'oauth' or 'jwt'
# for the "DATA_SUBJECT_AUTHENTICATION_METHODS" arg.
EMPLOYEE_AUTHENTICATION_METHODS=transcend,session
DATA_SUBJECT_AUTHENTICATION_METHODS=transcend,session
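Before booting the container, it can help to confirm that the configuration file defines every variable from the minimal example above. The check below is a sketch; the function name and the idea of keeping the settings in a file are assumptions, not part of Sombra itself.

```shell
# Sketch: confirm an env file defines every variable from the minimal configuration.
# The file path is passed as the first argument; the function name is hypothetical.
check_sombra_env() {
  env_file="$1"
  missing=0
  for name in ORGANIZATION_URI JWT_ECDSA_KEY INTERNAL_KEY_HASH \
    EMPLOYEE_AUTHENTICATION_METHODS DATA_SUBJECT_AUTHENTICATION_METHODS; do
    # A variable counts as set when NAME= is followed by a value other than <FILL-ME>.
    if ! grep -Eq "^${name}=.+" "$env_file" || grep -q "^${name}=<FILL-ME>" "$env_file"; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The function returns a non-zero status and lists the missing names on stderr, so it can gate a deploy step.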

You can now begin networking services together. Start by running the Docker container in your hosting/orchestration service of choice (Kubernetes, ECS, EC2, self-hosted servers…) and attach the load balancers to their respective ports for HTTPS or HTTP traffic. Once this is complete, Sombra will attempt to "phone home" to notify the Transcend backend that Sombra is live. Sombra will attempt to make this connection for 2 minutes, after which your server will shut down. You can use the /health route on both ports as a health check test.

If you want to disable this "phone home" feature, you may do so by configuring SINGLE_TENANT_SYNC_TIMEOUT=0
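For a quick local smoke test, the run-and-check sequence might look like the sketch below. It assumes the default HTTPS ports (5040 internal, 5041 external) and a hypothetical sombra.env file holding the configuration. SOMBRA_IMAGE, HEALTH_INTERVAL, and the function names are also illustrative. If you terminate TLS at the load balancer, swap in http and the HTTP ports.

```shell
# Sketch: start Sombra locally and wait for its health checks to pass.
# SOMBRA_IMAGE and sombra.env are hypothetical names; the ports are the HTTPS defaults.
SOMBRA_IMAGE="${SOMBRA_IMAGE:-docker.transcend.io/sombra:prod}"

run_sombra() {
  docker run -d --name sombra --env-file sombra.env \
    -p 5040:5040 -p 5041:5041 "$SOMBRA_IMAGE"
}

# Poll /health on one port until it responds or the attempts run out.
# -k skips certificate verification, which is only reasonable for a local test.
wait_for_health() {
  port="$1"
  attempts="${2:-24}"
  while [ "$attempts" -gt 0 ]; do
    if curl -fsSk "https://localhost:${port}/health" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${HEALTH_INTERVAL:-5}"
  done
  return 1
}
```

With the defaults, wait_for_health 5040 polls for roughly two minutes, which matches the phone-home window. Note that Docker's --env-file takes values literally, so quotes around a value become part of the value.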

To test that the Sombra gateway is running and fully functional, check the following:

  1. The gateway stays running for more than 2 minutes without SINGLE_TENANT_SYNC_TIMEOUT=0 set.
  2. A privacy request can be submitted end to end. Go to the Admin Dashboard and click New Request. After you submit the request, you should see some routes being hit in the Sombra gateway's logs. If the privacy request is submitted successfully, then the gateway is working!

It is crucial that before you use this gateway on production data, you cycle the two secrets that the gateway needs to operate:

  • JWT_ECDSA_KEY: The JSON web token private key for asymmetrically signing Sombra payloads using an Elliptic Curve Digital Signature Algorithm (ECDSA) with curve P-384 (also known as secp384r1, or the ES384 algorithm).
  • INTERNAL_KEY_HASH: The hash of a randomly-generated API key that will be used to verify incoming requests from your internal systems.

You can use the following bash commands to generate new secrets locally.

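# openssl base64 -A keeps each value on a single line so it can be used as an environment variable
JWT_ECDSA_KEY=$(openssl ecparam -genkey -name secp384r1 -noout | openssl base64 -A)
# Keep the key base64-encoded in the variable; shell variables cannot safely hold raw binary data
INTERNAL_KEY=$(openssl rand -base64 32)
# Hash the decoded key bytes, not the base64 text
INTERNAL_KEY_HASH=$(echo "$INTERNAL_KEY" | openssl base64 -d -A | openssl dgst -binary -sha256 | openssl base64 -A)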
echo "set in your service environments: INTERNAL_KEY: $INTERNAL_KEY"
echo "set gateway environment: JWT_ECDSA_KEY: ${JWT_ECDSA_KEY}"
echo "set gateway environment: INTERNAL_KEY_HASH: $INTERNAL_KEY_HASH"

The INTERNAL_KEY output is a random 256-bit key used to authenticate API requests from your internal services to Sombra. Save this secret for later (you can always generate a new one and update the environment variables again). Store it securely and make it accessible to your internal services, which pass it as a Bearer token with their HTTP requests to Sombra. To authenticate an internal service to Sombra, pass the header x-sombra-authorization: Bearer ${INTERNAL_KEY}.
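As an illustration of the header format, an internal service could wrap curl as sketched below. SOMBRA_URL reuses the example hostname from earlier, and the function name is hypothetical. The /health route is used in the usage note only because it is the one route named in this guide; whether it requires the header is not stated here.

```shell
# Sketch: call Sombra from an internal service with the internal key as a Bearer token.
# SOMBRA_URL is the example hostname; INTERNAL_KEY should come from your secret store.
SOMBRA_URL="${SOMBRA_URL:-https://sombra-internal.acme.com}"

sombra_request() {
  path="$1"
  shift
  # Any extra arguments (method, body, more headers) are passed through to curl.
  curl -fsS -H "x-sombra-authorization: Bearer ${INTERNAL_KEY}" "$@" "${SOMBRA_URL}${path}"
}
```

For example, sombra_request /health sends a GET with the authorization header attached.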

At this point, you should have a functional Sombra gateway running! You can now begin connecting systems to Transcend, such as SaaS tools, and your API keys will be encrypted by your Sombra gateway—Transcend does not have access to these plaintext secrets. There are additional settings that you can configure into your Sombra gateway to enable certain features or authentication methods. You can read more about them below, or reach out to your Transcend representative to ask about a recommended configuration.

Sombra is responsible for authenticating your data subjects when they make requests. Sometimes, an administrator in your organization needs to assume the role of one of your data subjects. Currently, the following operations require this level of authentication:

  • submitting a DSR on behalf of a data subject
  • decrypting the data for a DSR that is in process

Sombra supports authenticating data subjects via OAuth and JWT magic links. You can use one or both, and they can be associated with different types of data subjects (such as registered users vs. guest users).

DATA_SUBJECT_AUTHENTICATION_METHODS: session,jwt, session,oauth, or session,jwt,oauth for both.

Read more about some of these settings in Identity Verification Concepts.

Sombra supports SAML Single Sign On (SSO) for authentication of employees at your company who need to sign in to the Admin Dashboard. You can also let Transcend handle the authentication via email and password (not recommended).

EMPLOYEE_AUTHENTICATION_METHODS: session,saml, session,transcend, or session,saml,transcend for both.

Note: These variables are used for the Sombra instance's HTTPS server. If your Sombra instance is deployed behind a load balancer, and you don't want to continue using TLS after the load balancer (thereby using HTTP within your network perimeter), you can omit these environment variables.

  • SOMBRA_TLS_CERT: The TLS certificate for this server, base64-encoded in PEM format.
  • SOMBRA_TLS_KEY: The TLS private key for this server, base64-encoded in PEM format.
  • SOMBRA_TLS_KEY_PASSPHRASE: when generating your TLS cert, you may have been prompted to add a passphrase to your private key. If you did, you can add the passphrase here to let Sombra access it.
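To try the HTTPS path in a test environment, you could generate a self-signed certificate and encode it for these variables, as sketched below. The hostname is the example from earlier and the function name is hypothetical. Production certificates should come from your own certificate authority.

```shell
# Sketch: create a self-signed certificate for testing and base64-encode it for Sombra.
# The default hostname is the example used earlier; -nodes leaves the key without a passphrase.
make_test_tls_env() {
  host="${1:-sombra-internal.acme.com}"
  dir=$(mktemp -d)
  openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=${host}" \
    -keyout "${dir}/key.pem" -out "${dir}/cert.pem" 2>/dev/null || return 1
  # -A emits the base64 on a single line, which is easier to place in an environment variable.
  echo "SOMBRA_TLS_CERT=$(openssl base64 -A -in "${dir}/cert.pem")"
  echo "SOMBRA_TLS_KEY=$(openssl base64 -A -in "${dir}/key.pem")"
  rm -rf "$dir"
}
```

Because the key is created without a passphrase, SOMBRA_TLS_KEY_PASSPHRASE is not needed in this case.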

JWT configuration is required if you configured jwt as one of the DATA_SUBJECT_AUTHENTICATION_METHODS.

JWT_AUTHENTICATION_PUBLIC_KEY: the base64-encoded ES384 public key (this is the asymmetric Elliptic Curve Digital Signature Algorithm). Generate this key-pair by running bash gen_jwt_keys.sh. This is what Sombra uses to verify that you signed the JWT with an authenticated user's profile information.
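The contents of gen_jwt_keys.sh are not shown in this guide, so the following is an assumed equivalent rather than the script itself: it generates a P-384 (ES384) private key with OpenSSL and prints the base64-encoded public key. The function name is hypothetical.

```shell
# Sketch: generate an ES384 key pair and print the public key, base64-encoded on one line.
# The private key is written to the path given as the first argument; keep it secret.
gen_jwt_public_key() {
  private_pem="$1"
  openssl ecparam -genkey -name secp384r1 -noout -out "$private_pem" || return 1
  # Derive the public half in PEM form and encode it.
  openssl ec -in "$private_pem" -pubout 2>/dev/null | openssl base64 -A
  echo
}
```

The printed value is a candidate for JWT_AUTHENTICATION_PUBLIC_KEY, while the private key stays with the service that signs the JWTs.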

OAuth configuration is required if you configured oauth as one of the DATA_SUBJECT_AUTHENTICATION_METHODS.

  • OAUTH_CLIENT_ID: The Client ID of your Privacy Center's OAuth 2 application
  • OAUTH_CLIENT_SECRET: The Client Secret of your Privacy Center's OAuth 2 application
  • OAUTH_GET_TOKEN_URL: The endpoint to make an Access Token Request, which resolves an authorization_code to an access_token
  • OAUTH_GET_TOKEN_BODY_GRANT_TYPE: (optional) The grant type for this OAuth token
  • OAUTH_GET_TOKEN_BODY_REDIRECT_URI: The redirect URI for a successful response from the authorization server
  • OAUTH_GET_TOKEN_METHOD: (optional) The HTTP method to use when retrieving the OAuth token. Defaults to POST
  • OAUTH_GET_TOKEN_HEADERS: (optional) The headers to use when retrieving the OAuth token
  • OAUTH_GET_CORE_ID_URL: The API endpoint to retrieve the core identifier of a user. For example https://api.example.com/v1/user-profile. The core identifier is usually a User ID. It should be unique to the user, and it should never change.
  • OAUTH_GET_CORE_ID_PATH: The JSON path to extract the coreIdentifier (or user ID) from a JSON response body from OAUTH_GET_CORE_ID_URL. For example, if the JSON response is:
{
  "data": {
    "user": {
      "id": "123ade451b38283",
      "name": "Ben Farrell",
      "email": "benfarrell@gmail.com"
    },
    "activities": [{}]
  }
}

then OAUTH_GET_CORE_ID_PATH should be data.user.id.

  • OAUTH_GET_EMAIL_URL: The API endpoint to find the email of a user. For example https://api.example.com/v1/user-profile. (often the same as OAUTH_GET_CORE_ID_URL)
  • OAUTH_GET_EMAIL_PATH: The JSON path to extract the user's email address from a JSON response body from OAUTH_GET_EMAIL_URL. In the example above, this would be data.user.email
  • OAUTH_EMAIL_IS_VERIFIED: (optional) A boolean indicating whether all users' emails have been verified by you previously. If this is true, Transcend will operate under the assumption that you have confirmed that all email addresses belong to their respective logged-in users. If this varies by user, the check can be done dynamically per user (see OAUTH_EMAIL_IS_VERIFIED_PATH). Verified emails can be used to discover data in other systems (such as marketing, sales, and other SaaS tools), so it's important that email addresses are verified as belonging to the logged-in user.
  • OAUTH_EMAIL_IS_VERIFIED_PATH: (optional) The JSON path to extract a boolean field in the JSON response body from OAUTH_GET_EMAIL_URL, indicating whether the email of this user has been verified by you previously. When this resolves to false, Transcend will send a verification email to confirm that the data subject has access to that email inbox. If this resolves to true, Transcend will skip that step. Verified emails can be used to discover data in other systems (such as marketing, sales, and other SaaS tools), so it's important that email addresses are verified as belonging to the logged-in user.
  • OAUTH_GET_PROFILE_PICTURE_URL: (optional) The API endpoint to retrieve a profile picture for the user. For example https://api.example.com/v1/user-profile. (often the same as OAUTH_GET_CORE_ID_URL or OAUTH_GET_EMAIL_URL). This displays their profile picture in the Privacy Center after a successful login.
  • OAUTH_GET_PROFILE_PICTURE_PATH: (optional) The JSON path to extract the profile picture URL for the user.
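Put together, the required OAuth settings for the JSON response shown above might look like the following. Every value here is hypothetical, including the token endpoint and redirect URI; only the two paths follow directly from the example response.

```shell
# Hypothetical values for an OAuth-backed Privacy Center; replace every value with your own.
DATA_SUBJECT_AUTHENTICATION_METHODS="session,oauth"
OAUTH_CLIENT_ID="privacy-center-client-id"
OAUTH_CLIENT_SECRET="<FILL-ME>"
OAUTH_GET_TOKEN_URL="https://api.example.com/oauth/token"
OAUTH_GET_TOKEN_BODY_REDIRECT_URI="https://privacy.example.com/login"
# Both lookups hit the same profile endpoint and read different fields from its response.
OAUTH_GET_CORE_ID_URL="https://api.example.com/v1/user-profile"
OAUTH_GET_CORE_ID_PATH="data.user.id"
OAUTH_GET_EMAIL_URL="https://api.example.com/v1/user-profile"
OAUTH_GET_EMAIL_PATH="data.user.email"
```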

KMS_PROVIDER: one of AWS or local (in next release: GCP, Azure, IBM). If you want to use your own Key Management System (KMS), you can set this value to the hosting provider you're using and Sombra will integrate with your KMS for key management. Otherwise, Sombra will use its own internal KMS (local). Defaults to local.

If you're not setting KMS_PROVIDER and are using Sombra's built-in KMS, you should set a key.

KEY_ENCRYPTION_BASE: A random 256-bit secret key used for content encryption key-derivation. You can generate a cryptographically secure random key by running openssl rand -base64 32. This must be base64-encoded. If omitted, Sombra will automatically generate and derive this key, using your JWT_ECDSA_KEY as a random seed (while this provides plenty of entropy, it's best practice to separate these keys).
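A short sketch of generating this value and confirming that it decodes to exactly 32 bytes (256 bits) before you store it. The decoded_bytes variable is only a local helper.

```shell
# Sketch: generate KEY_ENCRYPTION_BASE and confirm it decodes to 32 bytes (256 bits).
KEY_ENCRYPTION_BASE=$(openssl rand -base64 32)
decoded_bytes=$(echo "$KEY_ENCRYPTION_BASE" | openssl base64 -d -A | wc -c)
if [ "$decoded_bytes" -eq 32 ]; then
  echo "KEY_ENCRYPTION_BASE=${KEY_ENCRYPTION_BASE}"
else
  echo "unexpected key length: ${decoded_bytes} bytes" >&2
fi
```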

You can create a new AWS KMS key for Sombra to use. See the AWS docs for more information on how to create an AWS KMS key. Note: in AWS this was previously named "customer master key (CMK)".

AWS_KMS_KEY_ARN: The ARN of the AWS KMS key that Sombra should use.

AWS KMS Setup:

  1. Create a KMS key in AWS Key Management Service.
    1. Click Create key.
    2. Give it an alias, such as "sombra-key", and a description such as "Used by Transcend's Sombra gateway to perform encryption on user data download requests".
    3. Click through to "Define key usage permissions" and check the new user you just created.
    4. Click through and click "Finish".
    5. Open your new key and copy the ARN into the AWS_KMS_KEY_ARN environment variable.

If you are using the module defined in this repo, it will create a key for you and automatically set its permissions.
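If you prefer the command line to the console, the key creation steps could be sketched with the AWS CLI as below. This assumes the CLI is configured with permission to manage KMS keys. The function name is hypothetical, and the alias and description mirror the console steps. Granting key usage permissions (step 3 above) is not covered by this sketch.

```shell
# Sketch: create the KMS key and alias from the command line instead of the console.
# Assumes the AWS CLI is configured; key usage permissions still need to be granted separately.
create_sombra_kms_key() {
  key_arn=$(aws kms create-key \
    --description "Used by Transcend's Sombra gateway to perform encryption on user data download requests" \
    --query KeyMetadata.Arn --output text) || return 1
  aws kms create-alias --alias-name alias/sombra-key --target-key-id "$key_arn" || return 1
  # Print the value in the form expected by the gateway's environment.
  echo "AWS_KMS_KEY_ARN=${key_arn}"
}
```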

Employees at your company who need to sign in to the Admin Dashboard can authenticate to both Sombra and the Admin Dashboard with SAML Single Sign On. To use SAML authentication, the following values must be set:

  • SAML_ENTRYPOINT: The login endpoint where the SAML assertion came from. This is needed so that SAML validation is to spec.
  • SAML_CERT: The public key to validate the SAML assertion against
  • SAML_ISSUER: (optional) The issuer of the SAML certificate (defaults to transcend)
  • SAML_AUDIENCE: (optional) The audience of the SAML assertion (defaults to transcend)

For testing purposes:

  • ACCEPT_CLOCK_SKEWED_MS: Artificially skew the clock used to validate the expiration on the assertion (in ms)

Note: When EMPLOYEE_AUTHENTICATION_METHODS is set to session,saml, you are still able to invite other users into your Transcend account; they simply will not be able to assume the role of one of your data subjects. This can be useful if you want to invite one of your third-party vendors into your Transcend organization. You can configure their accounts on app.transcend.io to only have permissions to view their own integration. They will never be able to authenticate with Sombra, and they will never be able to view any of your customer data, including the customer data they upload.

  • INTERNAL_PORT_HTTPS: (optional) The HTTPS port you want to listen on internally. Defaults to 5040. Set to undefined if you only want to listen on HTTP.
  • INTERNAL_PORT_HTTP: (optional) The HTTP port you want to listen on internally, if at all. Defaults to 5039. Set to undefined if you only want to listen on HTTPS. At least one of INTERNAL_PORT_HTTPS and INTERNAL_PORT_HTTP must be set.
  • EXTERNAL_PORT_HTTPS: (optional) The HTTPS port you want Transcend to communicate with. Defaults to 5041. Set to undefined if you only want to listen on HTTP.
  • EXTERNAL_PORT_HTTP: (optional) The HTTP port you want Transcend to communicate with. Defaults to 5042. Set to undefined if you only want to listen on HTTPS. At least one of EXTERNAL_PORT_HTTPS and EXTERNAL_PORT_HTTP must be set.
  1. Install the Datadog agent as a sidecar container so that Sombra can forward traces and metrics over your configured ports.
  2. Set the following Datadog-related configuration values:
  • RUN_DATADOG_APM: Initialize Datadog tracing. Default: true
  • DD_APM_PORT: Datadog Agent APM port, used for sending trace data. Default: 8126
  • DD_HOST: Hostname of the Datadog Agent that traces and metrics are sent to. Default: localhost
  • DD_STATSD_PORT: Datadog Agent metric port, used for sending metrics data. Default: 8125
  • DD_APM_ANALYTICS: Filter Analyzed Spans by user-defined tags. Default: true
  • DD_APM_LOG_INJECTION: Enable automatic injection of trace IDs in logs for supported logging libraries. Default: true
  • DD_APM_RUNTIME_METRICS: Whether to enable capturing runtime metrics. Port 8125 (or configured with DD_STATSD_PORT) must be opened on the Agent for UDP. Default: true
  • DD_TRACE_DEBUG: Enable debug logging in the tracer. Default: false