Compute Resource Recommendations
Sombra instances are largely stateless, so they can be scaled both vertically and horizontally. For asynchronous workloads such as Discovery & Classification and data subject requests (DSRs), Sombra is resilient to temporary downtime; for synchronous workloads such as Preference Management, Sombra uptime is critical because end users are waiting on the response.
This document covers the minimum requirements to get started with running Sombra, our recommended production configuration, autoscaling guidance, and detailed sizing recommendations based on which Transcend products you use.
The table below shows the bare minimum resources required for Sombra to start up and handle requests. These are suitable for development, testing, or very low-volume production environments.
| Component | vCPU | Memory | GPU | Notes |
|---|---|---|---|---|
| Sombra (core) | 2 | 4 GB | None | Handles all standard Transcend operations |
| LLM Classifier (optional) | 4 | 15 GB | 1x NVIDIA (A10G or equivalent) | Required for content classification and unstructured discovery. Deployed as a separate service. See LLM Classifier for hardware details. |
Important: The LLM Classifier is a separate service from Sombra and requires its own dedicated GPU-enabled compute. It is not required unless you are using content classification or unstructured discovery.
For most production deployments, we recommend the following as a safe starting point that will handle the majority of workloads without further tuning:
| Setting | Value |
|---|---|
| Memory per instance | 8 GB |
| vCPU per instance | 2–3 |
| Minimum instances | 2 (for high availability) |
| Maximum instances (autoscaling) | 10–20 |
| Scaling trigger | Request count per target, or memory utilization |
| Availability zones | 2+ (separate geographic locations recommended) |
This configuration provides:
- High availability through multiple instances across availability zones
- Headroom for traffic spikes without manual intervention
- Sufficient memory for large file processing (Google Drive, AWS S3, databases)
- Cost efficiency through autoscaling — you only pay for what you use
For a rough cost estimate, see Self-Hosting Costs. A baseline 2-instance Fargate deployment (2 vCPU, 5 GB each) runs approximately $150/month for compute alone.
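The $150/month figure can be sanity-checked against Fargate's on-demand pricing. The rates below are assumptions based on us-east-1 on-demand pricing at the time of writing; verify against current AWS pricing for your region.

```python
# Rough Fargate cost check for the baseline deployment: 2 instances,
# each with 2 vCPU and 5 GB memory, running 24/7 (~730 hours/month).
# Rates are assumed us-east-1 on-demand prices; verify against AWS pricing.
VCPU_PER_HOUR = 0.04048   # USD per vCPU-hour
GB_PER_HOUR = 0.004445    # USD per GB-hour

instances, vcpus, memory_gb, hours = 2, 2, 5, 730
hourly = vcpus * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR
monthly = instances * hourly * hours
print(f"${monthly:.0f}/month")  # roughly $150/month for compute alone
```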
Our recommended hosting options, such as our Kubernetes Helm chart or our AWS ECS Fargate Terraform module, include autoscaling configurations out of the box. You can always adjust the limits to match your needs.
The table below summarizes the autoscaling triggers we recommend for each use case; you can apply these through the autoscaling settings exposed by our Helm chart or Terraform module.
| Use case | Primary scaling trigger | Secondary trigger | Notes |
|---|---|---|---|
| DSRs | Memory utilization (70%) | CPU utilization (70%) | Bursty during batch processing windows |
| Discovery & Classification | Memory utilization (70%) | CPU utilization (70%) | Spikes during scheduled scan windows |
| Preference Management | Request count per target | Response latency (p95) | Latency-sensitive; scale out early |
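As a concrete illustration, the trigger guidance above might translate into a Helm values override along these lines. The key names below are illustrative assumptions, not the Sombra chart's actual schema; consult the chart's values.yaml for the real keys.

```yaml
# Illustrative Helm values override for autoscaling.
# Key names are assumptions, not the actual Sombra chart schema.
autoscaling:
  enabled: true
  minReplicas: 2          # high-availability baseline
  maxReplicas: 20         # bounded upper limit to cap runaway costs
  # Batch workloads (DSRs, Discovery & Classification): scale on memory/CPU
  targetMemoryUtilizationPercentage: 70
  targetCPUUtilizationPercentage: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # 5-minute scale-in cooldown to avoid flapping
```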
- Set a high but bounded upper limit on instance count to prevent runaway costs while ensuring availability.
- Use request count per target as the scaling trigger for latency-sensitive workloads (Preference Management).
- Use memory utilization as the scaling trigger for batch workloads (DSRs, Discovery & Classification).
- Configure scale-in cooldown periods of at least 5 minutes to avoid flapping.
- Because Sombra can scale both horizontally and vertically, you have many options for handling load. A well-tuned autoscaling configuration keeps Sombra highly available while keeping costs proportional to usage.
Different Transcend products place different demands on Sombra. Use the guidance below to adjust your resource allocation based on which products you have enabled.
DSR processing is Sombra's most common workload. Each DSR triggers requests to your connected data systems (SaaS vendors, databases, internal APIs) and processes the results.
| Tier | vCPU | Memory | Instances | When to use |
|---|---|---|---|---|
| Low volume | 2 | 4 GB | 1–2 | < 1,000 DSRs/month, few connected systems |
| Standard | 2 | 5 GB | 2–4 | 1,000–50,000 DSRs/month, typical system count |
| High volume | 2–3 | 6–8 GB | 4–10 | 50,000+ DSRs/month, or systems with large individual files |
Key considerations:
- DSR processing is asynchronous and resilient to temporary Sombra downtime. If instances are overloaded, integration plugins will retry automatically.
- Systems with large individual files (Google Drive, AWS S3, certain databases) benefit from vertical scaling (more RAM per instance) rather than just adding more instances.
- You can start with the minimum and scale up as needed. Overloaded instances will produce 500 errors in your action items, which is your signal to increase resources.
Discovery & Classification scans your connected systems to build an inventory of where personal data lives. This encompasses two types of work:
- Schema discovery identifies the structure of your data systems — tables, columns, fields, and their relationships.
- Content classification examines actual data values to classify them by data category (e.g., email address, SSN, date of birth).
Scans can be resource-intensive depending on the breadth of your data systems and how many are being scanned concurrently.
| Tier | vCPU | Memory | Instances | When to use |
|---|---|---|---|---|
| Standard | 2 | 5 GB | 2–4 | Moderate number of systems, schema discovery only |
| Large-scale | 2–3 | 6–8 GB | 4–10 | Many connected systems, wide schemas (hundreds of columns), content classification enabled, or frequent scan schedules |
Key considerations:
- Discovery scans run in the background and are resilient to Sombra restarts — scans will resume where they left off.
- Content classification samples a limited number of rows per column (not full table scans), so raw row count is generally not a scaling concern. The main driver of resource usage is schema breadth — the number of columns and subdatapoints being classified across your systems.
- When non-null sampling is enabled, Sombra may perform additional sampling passes to find non-null values for columns with sparse data, which increases resource usage.
- Scan frequency matters: if you run daily scans across dozens of systems, size toward the upper end.
- Discovery and DSR workloads share the same Sombra instances. If you run both, size for whichever is more demanding.
- Unstructured discovery (scanning file stores, document repositories, etc.) requires the LLM Classifier to be deployed. See the LLM Classifier documentation for its specific hardware and deployment requirements.
Preference Management handles synchronous API requests from your applications to read and write user consent preferences. Unlike DSRs and discovery, these requests are latency-sensitive — your end users are waiting for the response.
| Tier | vCPU | Memory | Instances | When to use |
|---|---|---|---|---|
| Standard | 2 | 5 GB | 2–4 | Up to ~10M preference API calls/month |
| High traffic | 3 | 8 GB | 5–10 | 10M–100M preference API calls/month |
| Very high traffic | 3 | 8 GB | 10–20 | 100M+ preference API calls/month, or strict latency requirements |
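To put these volumes in perspective, average request rate is a simple conversion from monthly call volume. Real traffic is rarely uniform, so peak rates can be several times the average; the tiers above build in that headroom.

```python
# Convert monthly preference API call volume to an average requests/second.
# Real traffic is bursty, so provision for peaks well above this average.
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.6M seconds

def avg_rps(calls_per_month: float) -> float:
    return calls_per_month / SECONDS_PER_MONTH

for calls in (10_000_000, 100_000_000):
    print(f"{calls:>12,} calls/month is about {avg_rps(calls):.0f} req/s on average")
```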
Key considerations:
- Sombra uptime is critical for Preference Management. Unlike DSRs, there is no built-in retry — if Sombra is down, preference reads/writes will fail for your end users.
- Autoscaling should be configured with aggressive scale-out (short cooldown periods, low request-count thresholds) to handle traffic spikes.
- We recommend always having at least 2 instances across separate availability zones for high availability.
- Monitor request latency (p95, p99) in addition to CPU/memory. If latency degrades, scale out horizontally before scaling up vertically.
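The p95/p99 latencies mentioned above can be computed from a sample of request durations as follows. This is a generic monitoring sketch, not a Sombra API; in production you would typically pull these percentiles from your metrics system (CloudWatch, Prometheus, etc.).

```python
import math

# Compute a latency percentile from request durations (milliseconds)
# using the nearest-rank method: the smallest sample such that at least
# pct% of all samples are less than or equal to it.
def percentile(samples: list[float], pct: float) -> float:
    ordered = sorted(samples)
    rank = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[rank]

latencies_ms = [12, 15, 11, 240, 14, 13, 380, 16, 12, 18]
p95 = percentile(latencies_ms, 95)  # tail latency: slow outliers dominate
p50 = percentile(latencies_ms, 50)  # median: typical request
```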
If you are running multiple Transcend products through the same Sombra deployment, size your instances for the most demanding use case rather than summing the requirements. The table below provides a quick reference for common combinations:
| Product combination | vCPU | Memory | Instances |
|---|---|---|---|
| DSRs only | 2 | 5 GB | 2–4 |
| DSRs + Discovery & Classification | 2–3 | 6–8 GB | 2–6 |
| DSRs + Preference Management | 3 | 8 GB | 4–10 |
| DSRs + Discovery & Classification + Preference Management | 3 | 8 GB | 5–15 |
| All products (full platform) | 3 | 8 GB | 5–20 |
Note: The LLM Classifier always runs as a separate service with its own GPU-enabled compute and does not affect Sombra instance sizing. See LLM Classifier for its resource requirements.
If you have strong requirements for high availability and want to minimize operational overhead, use the following configuration:
| Setting | Value |
|---|---|
| Memory per instance | 8 GB |
| vCPU per instance | 3 |
| Minimum instances | 2 |
| Maximum instances | 20 |
| Scaling trigger | Request count per target (preferred) or memory utilization |
| Availability zones | 2+ |
This setup will handle virtually any production workload we have seen, including high-volume DSRs, large-scale discovery & classification, and heavy Preference Management traffic. With autoscaling, you will not hit the upper instance limits unless demand requires it, keeping costs proportional to actual usage.
| Component | Minimum | Recommended | Heavy workload |
|---|---|---|---|
| Sombra vCPU | 2 | 2–3 | 3 |
| Sombra Memory | 4 GB | 5–8 GB | 8 GB |
| Sombra Instances | 1 | 2–4 | 5–20 |
| LLM Classifier | See LLM Classifier docs (separate GPU service) | | |