Self-hosting Sombra

By self-hosting the security gateway (AKA "Sombra"), you will be able to ensure your customer data and API keys stay within your cloud. Learn more about how Sombra works here.

Hosting Sombra is easy. It's a simple, stateless Docker container configured through environment variables. If you use Terraform and AWS, you can use our Terraform Module. If you have a pre-defined pipeline in your stack for deploying Docker images, you can use that instead.

Check out our dedicated guides for Deploying Sombra on Kubernetes and Deploying Sombra on AWS with Terraform.

Sombra is not a publicly available Docker image, so you need authentication to pull the image.

If you're using AWS, you can pull our image from ECR using AWS IAM permissions, which is our recommended way for the strongest security. To do so, put 829095311197.dkr.ecr.eu-west-1.amazonaws.com/sombra:<version_tag> as your image name, replacing <version_tag> with the version you want to deploy, or prod for the latest version.

Before pulling the Sombra image, please reach out to us with your AWS Account IDs so we can add them to our permissions list.

Note: You will need to log in to Docker using your AWS profile before you can pull the Docker image.
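For reference, a typical way to do this with the AWS CLI v2 is shown below (the region matches the registry above; your credentials must belong to an allowlisted AWS account):

# Authenticate Docker to the ECR registry using your allowlisted AWS credentials
aws ecr get-login-password --region eu-west-1 \
  | docker login --username AWS --password-stdin 829095311197.dkr.ecr.eu-west-1.amazonaws.com

# Pull the image (prod, or a specific version tag)
docker pull 829095311197.dkr.ecr.eu-west-1.amazonaws.com/sombra:prod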

If you're deploying to any other cloud provider or an on-premise environment, you can pull our image from Transcend's private Docker registry using basic authentication.

First, please reach out to us and request permission to pull our Sombra image, and we will add your Transcend account to our permissions list.

To authenticate, you must first log in to our private registry:

docker login docker.transcend.io

You will be prompted to enter the basic auth credentials. The username will always be "Transcend" (this is case-sensitive), and the password will be any API Key for your organization within the Admin Dashboard (note: a scope is not required for the API key).

Once you've logged in, you may pull images by running:

docker pull docker.transcend.io/sombra:<version_tag>

Replace <version_tag> with either prod for the latest version, or the specific version you want to deploy. Check out Infrastructure -> Sombras -> Changelog to see the latest Docker version_tag.

The Sombra container runs two servers: sombra-customer-ingress and sombra-transcend-ingress. Typically, each server is mapped to its own URL and load balancer. It is also possible to map both servers to the same load balancer behind a single URL, using different ports.

It is common to terminate TLS at the load balancer and have the communication between the load balancer and the Sombra servers happen over HTTP. Sombra also supports TLS connections between the load balancer and the server; however, you will need to provide the certificate through environment variables, and you should consider a process for cycling this certificate. To read more about configuring TLS communication between the load balancer and servers, see the TLS Certificate section below.

This server handles all communication that comes inbound from your internal servers. Here, your customer data is encrypted before it enters the Transcend cloud. Some common use cases include:

The load balancer that you connect to this server can be restricted to the IP ranges that need to communicate with it. When you launch into production, it is recommended to lock down this server to the minimal set of IP ranges. Additionally, you can configure bearer token (AKA "internal key") authentication for any communication with the gateway. See the Cycle your Keys section to read more about generating this bearer token.

  • Docker Container HTTP Port (terminating TLS at load balancer): 5039
  • Docker Container HTTPS Port (requires TLS Certificate): 5040
  • Load Balancer Port: 443

This server handles all communication that comes inbound from the Transcend cloud. Here, Transcend makes requests to your internal systems; the gateway authenticates those requests, routes them to the correct system, and then encrypts any sensitive or personal data returned by that system. Some examples of requests that will be initiated to this server include:

  • Proxying webhooks to your internal services.
  • Proxying SQL statements to your databases.
  • Proxying network requests to your SaaS services.
  • Encrypting or decrypting customer data.
  • Generating encryption keys for DSRs.
  • Proxying and encrypting email communications.
  • Authenticating data subjects and employees.
  • Most operations from the Privacy Center and Admin Dashboard indirectly rely on this server.

The load balancer that you connect to this server should be restricted to the Transcend IP ranges. In addition to IP restriction, all requests from Transcend will be signed, and the gateway will verify the Transcend signature on every request.

  • Docker Container HTTP Port (terminating TLS at load balancer): 5042
  • Docker Container HTTPS Port (requires TLS Certificate): 5041
  • 2 Load Balancer Setup Port: 443
  • Single Load Balancer Setup Port: 5041

We recommend using at least 2 CPUs or vCPUs, and at least 2 GB of RAM per CPU, like a t3.small on AWS. For companies with infrequent data requests, instances that support burst performance are a good choice.

Sombra should be deployed in a cluster with at least 2 containers running. Sombra needs to be highly available, so it's important to have redundancy. If any of the following criteria are true about your business, you may want to consider deploying more replicas:

  • You have more than 10 discovery plugins.
  • You are processing 10k+ DSRs per month.

If you expect to receive many DSRs, and you are returning large data exports to your users (such as video libraries), then you may want to optimize your hardware for encryption speed. Sombra's encryption algorithm (AES-256-GCM) runs fastest on processors that support AES-NI. Since Sombra streams data, and never buffers whole files into memory, you do not need to worry about increasing your RAM in this case.
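If you want to confirm that a Linux host advertises AES-NI, a quick check (shown here as an illustrative snippet) is to look for the aes CPU flag:

# Prints a confirmation if the CPU flags include aes (AES-NI)
grep -q '\baes\b' /proc/cpuinfo && echo "AES-NI supported" || echo "AES-NI not detected"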

Transcend will always communicate with Sombra over TLS; however, you have the option to terminate TLS at the load balancer, or to continue using TLS within your cloud. Sombra supports both HTTPS and HTTP communication with your internal systems. See the Ports section below to see how you can configure which ports map to HTTPS vs HTTP. If you decide to continue using TLS after the load balancer, you will need to give Sombra a TLS certificate through environment variables.

We recommend following your company's standards here. If you normally terminate TLS for other services, you should do that here as well. If you maintain TLS between services, you should configure Sombra to do so as well.

If you are using a proxy like Cloudflare, please ensure that you have a sufficiently large limit on the size of messages that can go through the gateway. It is possible that you may need to upload files that are a few gigabytes. If possible, you should allow for ingress and egress file size limits over 2 GB. If you need a lower limit, starting at 200 MB should be safe for most situations.

If you have a Transcend Sandbox, you have the option of configuring a single Sombra gateway that is used in both environments, or deploying two separate Sombra gateways: one in your sandbox environment and one in production. The choice is yours, but you must coordinate this setup with your Transcend account representative.

See this guide to learn what regionalization options are available and how to configure them.

The first thing you'll want to do is decide where you plan to host the gateway. You will need two URLs: one for sombra-customer-ingress and one for sombra-transcend-ingress. For example, this may look something like:

  • sombra-transcend-ingress: https://sombra-transcend.acme.com
  • sombra-customer-ingress: https://sombra-internal.acme.com

Once you decide on these subdomain names, you will want to create a new Sombra gateway under Infrastructure/Sombra by clicking the "Create New Self Hosted Sombra" button.

Once you've decided on your subdomains, you'll need to attach each domain to a load balancer, and each load balancer will attach to a different port on the Sombra gateway. See Ports for more details.

The following example can be used as a minimum viable configuration for testing that your Sombra gateway can boot up properly.

# This value can be found under "Sombra Audience" here: https://app.transcend.io/infrastructure/sombra/sombras
ORGANIZATION_URI=<FILL-ME>
# This value can be found under "ID" here: https://app.transcend.io/infrastructure/sombra/sombras
# NOTE: This is NOT the Multi Tenant Sombra, but the Self Hosted Sombra you created in Step 1.
SOMBRA_ID=<FILL-ME>

# The root secrets that you should generate yourself and keep secret
# See the section below for information on how to generate these values
JWT_ECDSA_KEY="LS0tLS1CRUdJTiBFQyBQUklWQVRFIEtFWS0tLS0tCk1JR2tBZ0VCQkRCT0JkNExXVzNaTkJXOWhyTUJ4YlJUemx0SjZjWitIMm5GM3FybDgwdnpLbG1yMnFkRzU5YTUKOU1vWTJhWTJYWVNnQndZRks0RUVBQ0toWkFOaUFBUTBQOUI5Nm9FaVZhWmo3RnhRWThtM1JaMnRRRkVNaUhaWgpKTXk0NjdBcEJiRFRJZkpHRWh3MjAvcnljS3gxY25CUzRqYk5rdTVLNHh0TlpSMDcwVHNFWkREVmh3Y3kxNWRkCktWaDJGcVZvczkxVjVCSVUyK0xENUpYUGUweUVtM1U9Ci0tLS0tRU5EIEVDIFBSSVZBVEUgS0VZLS0tLS0K"
INTERNAL_KEY_HASH="wm/mZTcSALaEibJXmhdq8g7lUN19kgXQ4hWgjt3woE8="

# We recommend starting with the 'transcend' authentication method
# - After Single Sign On is setup, 'transcend' can be switched to 'saml'
#   for the "EMPLOYEE_AUTHENTICATION_METHODS" arg.
# - After Account Login is setup, 'transcend' can be switched to 'oauth' or 'jwt'
#   for the "DATA_SUBJECT_AUTHENTICATION_METHODS" arg.
EMPLOYEE_AUTHENTICATION_METHODS=transcend,session
DATA_SUBJECT_AUTHENTICATION_METHODS=transcend,session
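As a quick smoke test, you can put the variables above in an env file and start the container directly with Docker (the file name and version tag below are illustrative; see the Ports section for the port mappings):

# sombra.env contains the environment variables above, one KEY=value per line
docker run --rm \
  --env-file sombra.env \
  -p 5039:5039 \
  -p 5042:5042 \
  docker.transcend.io/sombra:prod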

Note: The SOMBRA_ID parameter is only required when deploying multiple Sombra gateways. To understand more about whether you need multiple gateways and how to obtain the Sombra ID, see the section on using multiple Sombra Gateways.

If you are migrating from a Transcend-hosted multi-tenant Sombra to an on-premises deployment and are mid-implementation, it is critical that you re-use the same JWT_ECDSA_KEY from the existing instance. If you already have connected integrations and DSRs, you should deploy the on-premises Sombra gateway with the same JWT_ECDSA_KEY and then run a key rotation after the gateway is deployed.

To obtain the JWT_ECDSA_KEY, reach out to your account manager over Slack or email support@transcend.io to be granted access to download the key. For security reasons, you will only be able to reveal the key once. The key can be obtained by clicking the Reveal Multi Tenant Root Secret button on the Sombra Gateways panel in the Admin UI.

This key should not be committed to code and should be provided to the gateway securely using a tool such as Hashicorp Vault or AWS Systems Manager Parameter Store.
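For example, with AWS Systems Manager Parameter Store you could store the key as a SecureString and fetch it at deploy time (the parameter name is illustrative):

# Store the key as an encrypted parameter
aws ssm put-parameter \
  --name /sombra/JWT_ECDSA_KEY \
  --type SecureString \
  --value "$JWT_ECDSA_KEY"

# Read it back when injecting it into the container environment
aws ssm get-parameter \
  --name /sombra/JWT_ECDSA_KEY \
  --with-decryption \
  --query 'Parameter.Value' \
  --output text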

When rotating keys, we recommend that you reach out to your Transcend support rep to help guide and monitor the process to avoid downtime.

You can now begin networking services together. Start by running the Docker container in your hosting/orchestration service of choice (Kubernetes, ECS, EC2, self-hosted servers, etc.) and attach the load balancers to their respective ports for HTTPS or HTTP traffic. Once this is complete, Sombra will attempt to "phone home" to notify the Transcend backend that Sombra is live. Sombra will attempt to make this connection for 2 minutes, after which your server will shut down. You can use the /health route on both ports as a health check test.
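For example, from a host that can reach the container, a basic health check might look like this (the host and ports depend on your setup; 5039 and 5042 are the default HTTP ports):

# sombra-customer-ingress (internal) HTTP port
curl -f http://localhost:5039/health

# sombra-transcend-ingress (external) HTTP port
curl -f http://localhost:5042/health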

Debugging Tips

Clicking the "Test" button under Test Gateway Connection will make a GET request to the /test endpoint of the Sombra Gateway. If this fails, it is likely that the load balancer is not setup to receive inbound requests from our IP ranges. See the documentation on IP Allowlisting.

Clicking the "Test" button under Test Employee Authentication can be used to ensure that the same JWT_ECDSA_KEY is being shared across all Sombra Gateways. After clicking "Test" for the multi tenant gateway and logging in, clicking "Test" for the self hosted gateway should not trigger a re-login. If this is not the case, then this indicates that the `JWT_ECDSA_KEY`` is not the same across both gateways.

To disable Sombra from attempting to "phone home" to notify Transcend that Sombra is live, you may configure SINGLE_TENANT_SYNC_TIMEOUT=0.

You can verify that the "phone home" procedure succeeded by checking if the Sombra gateway version shows up in the Admin UI.

In order to test that the Sombra gateway is running and fully functional, you should ensure:

  1. The gateway is running for over 2 minutes without SINGLE_TENANT_SYNC_TIMEOUT=0 set.
  2. The new self-hosted gateway is set as the primary Sombra Gateway.
  3. A test DSR can be submitted: go to the Admin Dashboard and click New Request. After you submit the request, you should see some routes being hit in the Sombra gateway's logs. If the DSR was submitted successfully, then the gateway is working!

It is crucial that before you use this gateway on production data, you cycle the two secrets that the gateway needs to operate:

  • JWT_ECDSA_KEY: The JSON web token private key for asymmetrically signing Sombra payloads using an Elliptic Curve Digital Signature Algorithm (ECDSA) with curve P-384 (also known as secp384r1, or the ES384 algorithm).
  • INTERNAL_KEY_HASH: The hash of a randomly-generated API key that will be used to verify incoming requests from your internal systems.

You can use the following bash commands to generate new secrets locally.

JWT_ECDSA_KEY=$(openssl ecparam -genkey -name secp384r1 -noout | base64)
INTERNAL_KEY_BIN=$(openssl rand 32)
INTERNAL_KEY=$(echo -n "$INTERNAL_KEY_BIN" | base64)
INTERNAL_KEY_HASH=$(echo -n "$INTERNAL_KEY_BIN" | openssl dgst -binary -sha256 | openssl base64)
echo "set in your service environments: INTERNAL_KEY: $INTERNAL_KEY"

echo "set gateway environment: JWT_ECDSA_KEY: ${JWT_ECDSA_KEY}"
echo "set gateway environment: INTERNAL_KEY_HASH: $INTERNAL_KEY_HASH"

The INTERNAL_KEY output is a random 256-bit key which is used for API authentication of requests from your internal services to Sombra. Save this secret for later (you can always generate a new one and update the environment variables again). It should be passed as a Bearer token with your HTTP requests to Sombra, so you should store it securely and make it accessible to your internal services. To authenticate an internal service to Sombra, pass the header x-sombra-authorization: Bearer ${INTERNAL_KEY}.
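As an illustration, a request from one of your internal services might look like the following (the hostname and route are placeholders for whichever Sombra endpoint your service calls):

curl "https://sombra-internal.acme.com/<sombra-route>" \
  -H "x-sombra-authorization: Bearer $INTERNAL_KEY" \
  -H "Content-Type: application/json"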

At this point, you should have a functional Sombra gateway running! You can now begin connecting systems to Transcend, such as SaaS tools, and your API keys will be encrypted by your Sombra gateway—Transcend does not have access to these plaintext secrets. There are additional settings that you can configure into your Sombra gateway to enable certain features or authentication methods. You can read more about them below, or reach out to your Transcend representative to ask about a recommended configuration.

Sombra is responsible for authenticating your data subjects when they make requests. Sometimes, an administrator in your organization needs to assume the role of one of your data subjects. Currently, the following operations require this level of authentication:

  • submitting a DSR on behalf of a data subject
  • decrypting the data for a DSR that is in process

Sombra supports authenticating data subjects via OAuth and JWT magic links. You can use one or both, and they can be associated with different types of data subjects (such as registered users vs. guest users).

DATA_SUBJECT_AUTHENTICATION_METHODS: session,jwt, session,oauth, or session,jwt,oauth for both.

Read more about some of these settings in Identity Verification Concepts

Sombra supports SAML Single Sign On (SSO) for authentication of employees at your company who need to sign in to the Admin Dashboard. You can also let Transcend handle the authentication via email and password (not recommended).

EMPLOYEE_AUTHENTICATION_METHODS: session,saml, session,transcend, or session,saml,transcend for both.

Note: These variables are used for the Sombra instance's HTTPS server. If your Sombra instance is deployed behind a load balancer, and you don't want to continue using TLS after the load balancer (thereby using HTTP within your network perimeter), you can omit these environment variables.

  • SOMBRA_TLS_CERT: The TLS certificate for this server, base64-encoded in PEM format.
  • SOMBRA_TLS_KEY: The TLS private key for this server, base64-encoded in PEM format.
  • SOMBRA_TLS_KEY_PASSPHRASE: When generating your TLS cert, you may have been prompted to add a passphrase to your private key. If you did, you can add the passphrase here to let Sombra access it.

If you want to set these values via a Hashicorp Vault secret store and have the values fetched dynamically at runtime, see our documentation here.

JWT configuration is required if you configured jwt as one of the DATA_SUBJECT_AUTHENTICATION_METHODS. e.g. DATA_SUBJECT_AUTHENTICATION_METHODS=session,jwt

JWT_AUTHENTICATION_PUBLIC_KEY: the base64-encoded ES384 public key (this is the asymmetric Elliptic Curve Digital Signature Algorithm). Generate this key-pair by running the command below:

# Directory to write the temporary key files to (defaults to the current directory)
export WRITE_DIR=${WRITE_DIR:-"."}

PRIVATE_FILE="$WRITE_DIR/jwtES384-private-key.key"
PUBLIC_FILE="$WRITE_DIR/jwtES384.key.pub"

openssl ecparam -name secp384r1 -genkey -noout -out "$PRIVATE_FILE"
openssl ec -in "$PRIVATE_FILE" -pubout -out "$PUBLIC_FILE"
echo ""
echo "JWT_AUTHENTICATION_PUBLIC_KEY:"
< "$PUBLIC_FILE" base64

echo ""
echo "JWT_AUTHENTICATION_PRIVATE_KEY:"
< "$PRIVATE_FILE" base64

# Cleanup
rm "$PRIVATE_FILE"
rm "$PUBLIC_FILE"

The value for JWT_AUTHENTICATION_PUBLIC_KEY should be placed into the environment of Sombra. The value for JWT_AUTHENTICATION_PRIVATE_KEY should be placed in the environment of your server that will be signing the JWT payload.

OAuth configuration is required if you configured oauth as one of the DATA_SUBJECT_AUTHENTICATION_METHODS.

  • OAUTH_CLIENT_ID: The Client ID of your Privacy Center's OAuth 2 application
  • OAUTH_CLIENT_SECRET: The Client Secret of your Privacy Center's OAuth 2 application
  • OAUTH_GET_TOKEN_URL: The endpoint to make an Access Token Request, which resolves an authorization_code to an access_token
  • OAUTH_GET_TOKEN_BODY_GRANT_TYPE: (optional) The grant type for this OAuth token
  • OAUTH_GET_TOKEN_BODY_REDIRECT_URI: The redirect URI for a successful response from the authorization server
  • OAUTH_GET_TOKEN_METHOD: (optional) The HTTP method to use when retrieving the OAuth token. Defaults to POST
  • OAUTH_GET_TOKEN_HEADERS: (optional) The headers to use when retrieving the OAuth token
  • OAUTH_GET_CORE_ID_DATA_SUBJECT_TYPE: The type of Data Subject that the OAuth configuration refers to. This should use the Data Subject slug, which is the value found in parentheses under DSR Automation -> Request Settings -> Data Subjects.
  • OAUTH_GET_CORE_ID_URL: The API endpoint to retrieve the core identifier of a user. For example https://api.example.com/v1/user-profile. The core identifier is usually a User ID. It should be unique to the user, and it should never change.
  • OAUTH_GET_CORE_ID_PATH: The JSON path to extract the coreIdentifier (or user ID) from a JSON response body from OAUTH_GET_CORE_ID_URL. For example, if the JSON response is:
{
  "data": {
    "user": {
      "id": "123ade451b38283",
      "name": "Ben Farrell",
      "email": "benfarrell@gmail.com"
    },
    "activities": [{}]
  }
}

then OAUTH_GET_CORE_ID_PATH should be data.user.id.

  • OAUTH_GET_EMAIL_URL: The API endpoint to find the email of a user. For example https://api.example.com/v1/user-profile. (often the same as OAUTH_GET_CORE_ID_URL)
  • OAUTH_GET_EMAIL_PATH: The JSON path to extract the user's email address from a JSON response body from OAUTH_GET_EMAIL_URL. In the example above, this would be data.user.email
  • OAUTH_EMAIL_IS_VERIFIED: (optional) A boolean, indicating whether all user's emails have been verified by you previously. If this is true, Transcend will operate under the assumption that you have confirmed all that all email addresses belong to their respective logged-in users. If this varies by user, this operation can be done dynamically per user (see OAUTH_EMAIL_IS_VERIFIED_PATH). Verified emails can be used to discover data in other systems (such as marketing, sales, and other SaaS tools), so it's important that email addresses are verified as belonging to the logged-in user.
  • OAUTH_EMAIL_IS_VERIFIED_PATH: (optional) The JSON path to extract a boolean field in the JSON response body from OAUTH_GET_EMAIL_URL, indicating whether the email of this user has been verified by you previously. When this resolves to false, Transcend will send a verification email to confirm that the data subject has access to that email inbox. If this resolves to true, Transcend will skip that step. Verified emails can be used to discover data in other systems (such as marketing, sales, and other SaaS tools), so it's important that email addresses are verified as belonging to the logged-in user.
  • OAUTH_GET_PROFILE_PICTURE_URL: (optional) The API endpoint to retrieve a profile picture for the user. For example https://api.example.com/v1/user-profile. (often the same as OAUTH_GET_CORE_ID_URL or OAUTH_GET_EMAIL_URL). This displays their profile picture in the Privacy Center after a successful login.
  • OAUTH_GET_PROFILE_PICTURE_PATH: (optional) The JSON path to extract the profile picture URL for the user.
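Putting these together, a sketch of an OAuth configuration might look like the following (all URLs, JSON paths, and the data subject slug are illustrative and depend on your own API and Privacy Center settings):

OAUTH_CLIENT_ID=<FILL-ME>
OAUTH_CLIENT_SECRET=<FILL-ME>
OAUTH_GET_TOKEN_URL=https://api.example.com/oauth/token
OAUTH_GET_TOKEN_BODY_REDIRECT_URI=https://privacy.example.com/oauth/callback
OAUTH_GET_CORE_ID_DATA_SUBJECT_TYPE=customer
OAUTH_GET_CORE_ID_URL=https://api.example.com/v1/user-profile
OAUTH_GET_CORE_ID_PATH=data.user.id
OAUTH_GET_EMAIL_URL=https://api.example.com/v1/user-profile
OAUTH_GET_EMAIL_PATH=data.user.email
OAUTH_EMAIL_IS_VERIFIED=true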

KMS_PROVIDER: one of AWS or local (in next release: GCP, Azure, IBM). If you want to use your own Key Management System (KMS), you can set this value to the hosting provider you're using and Sombra will integrate with your KMS for key management. Otherwise, Sombra will use its own internal KMS (local). Defaults to local.

By default, Sombra uses a runtime KMS implementation. Sombra uses key-derivation functions to generate the root KMS secret by deriving a new secret key from the JWT_ECDSA_KEY key.

If you want to set the key value via a Hashicorp Vault secret store and have the value fetched dynamically at runtime, see our documentation here.

You can create a new AWS KMS key for Sombra to use. See the AWS docs for more information on how to create an AWS KMS key. Note: in AWS this was previously named "customer master key (CMK)".

AWS_KMS_KEY_ARN: The ARN of the AWS KMS key.

AWS KMS Setup:

  1. Create a KMS key in AWS Key Management Service (KMS).
    1. Click Create key.
    2. Give it an alias, such as "sombra-key", and a description such as "Used by Transcend's Sombra gateway to perform encryption on user data download requests".
    3. Click through to "Define key usage permissions" and check the new user you just created.
    4. Click through and click "Finish".
    5. Open your new key and copy the ARN into the AWS_KMS_KEY_ARN environment variable.

If you are using our Terraform module, it will create a key for you and automatically set its permissions.
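A sketch of the resulting environment variables (the ARN is a placeholder for the key you created):

KMS_PROVIDER=AWS
AWS_KMS_KEY_ARN=arn:aws:kms:eu-west-1:123456789012:key/<your-key-id>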

Employees at your company who need to sign in to the Admin Dashboard and Sombra can use SAML Single Sign On (SSO) to authenticate. To use SAML authentication, the following values must be set for the SAML configuration:

  • SAML_ENTRYPOINT: The login endpoint where the SAML assertion comes from. This is needed so that SAML validation is performed to spec
  • SAML_CERT: The public key to validate the SAML assertion against
  • SAML_ISSUER: (optional) The issuer of the SAML certificate (defaults to transcend)
  • SAML_AUDIENCE: (optional) The audience of the SAML assertion (defaults to transcend)

For testing purposes:

  • ACCEPT_CLOCK_SKEWED_MS: Artificially skew the clock used to validate the expiration on the assertion (in ms)

Note: When setting the value to session,saml, you are still able to invite other users into your Transcend account; they simply will not have the ability to take on the privileges of one of your data subjects. This can be useful if you want to invite one of your third-party vendors into your Transcend organization. You can configure their accounts on app.transcend.io to only have permission to view their own integration. They will never be able to authenticate with Sombra, and they will never be able to view any of your customer data—including the customer data they upload.
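For illustration, a SAML configuration using the variables above might look like this (the IdP entrypoint and certificate are placeholders from your identity provider):

EMPLOYEE_AUTHENTICATION_METHODS=session,saml
SAML_ENTRYPOINT=https://idp.example.com/app/sso/saml
SAML_CERT=<public-certificate-from-your-IdP>
SAML_ISSUER=transcend
SAML_AUDIENCE=transcend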

  • INTERNAL_PORT_HTTPS: (optional; defaults to 5040) The HTTPS port you want to listen on internally. Set to undefined if you only want to listen on HTTP.
  • INTERNAL_PORT_HTTP: (optional; defaults to 5039) The HTTP port you want to listen on internally, if at all. Set to undefined if you only want to listen on HTTPS. At least one of the two internal ports must be set.
  • EXTERNAL_PORT_HTTPS: (optional; defaults to 5041) The HTTPS port you want Transcend to communicate with. Set to undefined if you only want to listen on HTTP.
  • EXTERNAL_PORT_HTTP: (optional; defaults to 5042) The HTTP port you want Transcend to communicate with. Set to undefined if you only want to listen on HTTPS. At least one of the two external ports must be set.
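For example, an HTTP-only setup that terminates TLS at the load balancers might use the following (a sketch; adjust to your own topology):

# HTTP-only configuration (TLS terminated at the load balancers)
INTERNAL_PORT_HTTP=5039
EXTERNAL_PORT_HTTP=5042
# HTTPS ports disabled per the note above (set to undefined when not listening on HTTPS)
INTERNAL_PORT_HTTPS=undefined
EXTERNAL_PORT_HTTPS=undefined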
  1. Install the Datadog agent as a sidecar container so that Sombra can forward traces and metrics over your configured ports.
  2. Set the following Datadog-related configurations:
  • RUN_DATADOG_APM: Initialize Datadog tracing. Default: true
  • DD_APM_PORT: Datadog Agent APM port, used for sending trace data. Default: 8126
  • DD_HOST: Datadog Agent host; used as the string prefix name for the stat. Default: localhost
  • DD_STATSD_PORT: Datadog Agent metric port, used for sending metrics data. Default: 8125
  • DD_APM_BLOCKLIST: A blocklist of routes to pass to the trace. See the docs. Default: []
  • DD_APM_ANALYTICS: Filter Analyzed Spans by user-defined tags. Default: true
  • DD_APM_LOG_INJECTION: Enable automatic injection of trace IDs in logs for supported logging libraries. Default: true
  • DD_APM_RUNTIME_METRICS: Whether to enable capturing runtime metrics. Port 8125 (or configured with DD_STATSD_PORT) must be opened on the Agent for UDP. Default: true
  • DD_TRACE_DEBUG: Enable debug logging in the tracer. Default: false
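A sketch of a configuration that points at a Datadog agent sidecar on its default ports:

RUN_DATADOG_APM=true
DD_HOST=localhost
DD_APM_PORT=8126
DD_STATSD_PORT=8125
DD_APM_LOG_INJECTION=true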

If you want to control how long data subjects or employees can keep an active session to Sombra, such as for PCI DSS or other compliance needs, you can do so with the following variables:

  • EMPLOYEE_SESSION_EXPIRY_TIME: Max session duration for employees, as a string. Defaults to 3 hours
  • DATA_SUBJECT_SESSION_EXPIRY_TIME: Max session duration for data subjects, as a string. Defaults to 5 days
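For example, to tighten sessions for a stricter compliance posture (the values are illustrative; mirror the duration-string format of the defaults above):

EMPLOYEE_SESSION_EXPIRY_TIME="1 hour"
DATA_SUBJECT_SESSION_EXPIRY_TIME="2 days"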

Note that for employee authentication, not every action on the Admin Dashboard requires a Sombra session. Sombra sessions are only necessary for things like creating new requests, viewing sample data from data discovery, viewing a DSAR's contents, connecting an integration, etc.

The hosting costs of Sombra can vary depending on a few variables. Sombra is a simple web server shipped as a Docker image. It can be hosted with whatever framework you would typically use to run containers.

Our out-of-the-box Terraform module for AWS has hosting costs around $170-$180/month which can be used as a benchmark. The image below is a breakdown of the bill that you would typically see for the AWS resources required to run the gateway:

Our out-of-the-box Terraform module uses AWS Fargate to simplify the process of managing and deploying container-based images. This gateway will receive a fair amount of traffic initiated by your customers, your employees, and Transcend's integrations. For improved performance and availability, we recommend running 2 instances to add resiliency when there are spikes in traffic (e.g. a privacy policy update announced in an email chain).

If you are looking to cut costs, we recommend experimenting with smaller instance sizes before removing instance replication. If you have a small volume of DSRs or discovery plugins, you could try 1 vCPU and 5 GB of RAM to bring the bill down to around $90/month.

It's common to replace Fargate with another framework for managing container deployments (e.g. Kubernetes, EC2, Nomad, Docker Swarm, etc.). If you don't want to use our Terraform module with Fargate, our recommendation is to deploy the Docker image using whatever framework your team already uses for internal services.

Load balancers will be needed for communication between Sombra and Transcend, as well as between your internal systems and Sombra.

Data Usage costs are variable based on a number of things:

  1. Number of DSRs submitted
  2. Number of Data Silos in a DSR
  3. Number of discovery plugins and the frequency at which their scans run

The $2/month baseline is a reference point of around 100,000 DSRs per month, processed across 8 different systems with a couple of discovery plugins. 100,000 DSRs per month is much higher than the average company's volume. Although you may see fluctuation in your bill related to request volume, in most cases this cost is negligible compared to the total hosting costs.

The Sombra gateway produces detailed logs of any sensitive operations that happen within your organization. This includes:

  • Your Data Subjects submitting a DSR
  • Your Employees submitting a DSR
  • Your Data Subjects downloading their data
  • Your Employees previewing data in an Access request
  • Transcend sending a network request to your SaaS vendors for the purposes of fulfilling a DSR or Structured Discovery scan
  • Transcend running an employee-verified SQL statement against one of your databases to fulfill a DSR or for a Structured Discovery scan

The container produces these logs in a standard JSON format, and they can be ported into your log tool of choice. Transcend ports logging into Datadog, which we use to get visibility into logs and to set alerts and metrics; we use AWS FireLens to do this. The cost of those logs depends on which vendor you use. Price per log and log retention policies factor into the pricing, but for Transcend it ends up being around $5-10 per Sombra cluster per month.

To prevent unbounded connections from opening to upstream databases, Sombra now uses a pool of connections for each database endpoint and stores that pool in an LRU cache. The following configurations manage the cache and pool sizes. To enable connection pooling, you must set the environment variable ODBC_POOL_CACHE_SIZE based on how many upstream database servers are being scanned, plus some buffer to prevent cache misses.

Environment Variable | Description | Default | Required
ODBC_POOL_CACHE_SIZE | Max size of the LRU cache that stores the connection pool per database | N/A | Yes
ODBC_QUERY_TIMEOUT_IN_SECONDS | How long to wait for each ODBC query to execute before returning to the application | 60 seconds | No
ODBC_CONNECTION_TIMEOUT_IN_SECONDS | The number of seconds to wait for a request on the connection to complete before returning to the application | 0 | No
ODBC_LOGIN_TIMEOUT_IN_SECONDS | The number of seconds to wait for a login request to complete before returning to the application | 10 | No
ODBC_CONNECTION_MAX_POOL_SIZE | The maximum number of open Connections the Pool will create | 100 | No
ODBC_CONNECTION_INITIAL_POOL_SIZE | The initial number of Connections created in the Pool | 10 | No
ODBC_CONNECTION_POOL_SIZE_INCREMENT | How many additional Connections to create when all of the Pool's connections are taken | 10 | No
REUSE_ODBC_CONNECTIONS | Whether or not to reuse an existing Connection instead of creating a new one | true | No
ODBC_CONNECTION_POOL_SIZE_SHRINK | Whether or not the number of Connections should shrink to initialSize as they free up | true | No
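For example, if your discovery plugins scan roughly 20 distinct database servers, a starting point might look like this (a sketch; tune to your workload):

# ~20 database endpoints plus buffer to avoid cache misses
ODBC_POOL_CACHE_SIZE=25
ODBC_CONNECTION_INITIAL_POOL_SIZE=10
ODBC_CONNECTION_MAX_POOL_SIZE=100
REUSE_ODBC_CONNECTIONS=true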

In the event you have an issue with Sombra that requires Transcend support be involved, Sombra has a configuration option that allows it to send its logs to Transcend's servers.

Please be aware that this option may expose encryption-related metadata to Transcend, and we recommend turning this feature off once the issue has been resolved.

In order to use this feature, set LOG_HTTP_TRANSPORT_URL to https://collector.transcend.io/api/v1/logs, and ensure that your Sombra has synced with the Transcend backend using the "phone home" feature.

By default, the log transporter sends logs to Transcend in batches of 10 log lines, every 5 seconds. These values can be adjusted by setting the environment variables listed below.

Environment Variable | Description | Default | Required
LOG_HTTP_TRANSPORT_URL | The Transcend Collector's HTTPS ingress endpoint | N/A | Yes
LOG_HTTP_TRANSPORT_BATCH_INTERVAL_MS | The maximum time to wait between batches of logs sent to the Collector | 5000 milliseconds | No
LOG_HTTP_TRANSPORT_BATCH_COUNT | The maximum number of log lines to send in a single batched request | 10 | No
DD_SERVICE_NAME | The name for your Sombra | transcend-hosted-sombra | No
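A sketch of enabling the log transport with its default batching behavior:

LOG_HTTP_TRANSPORT_URL=https://collector.transcend.io/api/v1/logs
LOG_HTTP_TRANSPORT_BATCH_INTERVAL_MS=5000
LOG_HTTP_TRANSPORT_BATCH_COUNT=10
DD_SERVICE_NAME=transcend-hosted-sombra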

Most customers only need to deploy one Sombra instance. However, there are cases where it makes sense to deploy multiple instances:

  • If your company employs multiple VPCs that you want to keep separate, you may map one Sombra instance to each VPC. Alternatively, you could apply VPC Network Peering to give one Sombra instance access to resources across multiple VPCs.
  • If there are multiple DB subnets within a VPC, you may deploy one Sombra instance to each subnet.
  • If your company uses multiple cloud providers, you may wish to deploy one Sombra instance in each cloud environment.

If you choose to deploy multiple gateways, you will need to specify which Sombra instance is the primary gateway in the Admin Dashboard. By default, your integrations will route through your primary Sombra instance.

For each self-hosted Sombra instance, you will need to grab the instance ID from the Admin Dashboard and set the SOMBRA_ID environment variable to it within the Docker container that is running Sombra. See the section above on configuring the container service with the minimum set of environment variables.

Use the Associated Integrations dropdown in the Admin Dashboard to select which integrations should route to your non-primary Sombra instance(s).

Alternatively, you can specify which Sombra instance each integration should route through using the sombra_id parameter in Terraform. The value for sombra_id can be obtained in the Admin Dashboard.

NOTE: The environment variable JWT_ECDSA_KEY should be set to the same value for each Sombra instance.