Compute Resource Recommendations
Sombra instances are largely stateless, meaning they can easily be scaled both vertically and horizontally. Sombra is built to be resilient to downtime in many use cases, such as data mapping and data privacy requests, though there are also cases like Preference Management where Sombra uptime is critical for properly handling synchronous user requests.
This document discusses the minimum requirements for running Sombra and includes recommendations for scaling Sombra out to meet your needs.
The bare minimum requirement for Sombra to run and handle requests is about 4GB of RAM. We recommend starting with around 2 virtual CPUs on cloud systems for efficient startup times and process handling.
If using the LLM Cluster, you'll need a GPU-enabled server and should allocate at least 8GB of RAM to the service.
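The minimums above can be expressed as container resource requests and limits. The snippet below follows the standard Kubernetes pod spec; the values are illustrative starting points, not fields from the Sombra Helm chart itself.

```yaml
# Illustrative Kubernetes resource settings for a Sombra container.
# Values reflect the minimums described above; adjust for your workload.
resources:
  requests:
    cpu: "2"      # ~2 vCPUs for efficient startup and process handling
    memory: 4Gi   # bare minimum for Sombra to handle requests
  limits:
    memory: 8Gi   # headroom for the LLM Cluster or large-file scans
```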
Our recommended hosting options, such as our Kubernetes Helm chart or our AWS ECS Fargate Terraform module, will specify autoscaling configurations for you, though you can always adjust the limits. Due to Sombra's ability to scale either horizontally or vertically, you have many options for how to handle this scaling.
Here is an example of an autoscaling configuration in our Sombra Helm chart.
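The values below are a sketch of what such a configuration might look like, following common Helm chart conventions; the exact keys and defaults in the Sombra chart may differ.

```yaml
# Hypothetical Helm values for horizontal autoscaling; the actual keys in
# the Sombra chart may differ, so treat this as a sketch of the shape.
autoscaling:
  enabled: true
  minReplicas: 2          # keep at least two instances for high availability
  maxReplicas: 20         # a high, but bounded, upper limit
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80
```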
Using a proper autoscaling configuration helps ensure that Sombra servers are highly available while your costs scale in proportion to your usage.
- Our primary recommendation for production is to use autoscaling with high, but bounded, upper limits on resource usage.
- We recommend always having at least two Sombra instances running for high availability. Preferably, these instances would be in separate geographic locations, such as different availability zones within an AWS region.
- For products like data mapping and privacy request automation, you can start with the minimum recommended resources and scale up as you see errors. If your Sombra instances are becoming overloaded, you will start to see 500 errors related to the integration plugins showing up in your action items. The plugins will retry automatically, so you can simply increase CPU or memory until those errors stop occurring; our team will help provide recommendations along the way.
- We encourage you to set up monitoring on the container to determine whether CPU or memory is the bottleneck for any given operation.
- If you have high requirements for availability or want to minimize maintenance, set a high limit for the horizontal scaling target, such as allowing up to dozens of Sombra instances. Very few use cases require more than a few Sombra instances, though cases like Preference Management handling billions of incoming requests per day, or data mapping of extremely large databases, can use significant resources. With a proper autoscaling setup, you will not hit these upper limits unless necessary.
- If you are scanning systems with large individual files, such as Google Drive, AWS S3, or certain databases, we recommend also considering vertical scaling in addition to horizontal scaling to ensure the large files can be processed. In such cases, consider 6GB or 8GB of RAM per instance.
If you have strong requirements for high availability and want to absolutely minimize maintenance resourcing, the summary is to use 8GB of RAM per instance in an autoscaling setup that scales from 2 to 20 Sombra instances, with the scale-in/scale-out trigger based on request count per target (memory usage is also fine if it's easier in your particular environment). Such a setup will scale comfortably for just about any production use case we have seen.
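As a sketch, the summary setup above could be expressed as a Kubernetes HorizontalPodAutoscaler. Scaling on request count per target requires a custom metrics adapter (such as Prometheus Adapter) exposing a per-pod request-rate metric; the metric name and target value below are hypothetical, and the fallback of scaling on memory utilization works with no adapter at all.

```yaml
# Sketch of the summary setup as a Kubernetes HorizontalPodAutoscaler.
# The requests-per-second metric assumes a custom metrics adapter is
# installed; the metric name and target value are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sombra
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sombra
  minReplicas: 2    # at least two instances for high availability
  maxReplicas: 20   # high, but bounded, upper limit
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # hypothetical adapter-provided metric
        target:
          type: AverageValue
          averageValue: "100"
```

If request-based scaling is impractical in your environment, replace the `Pods` metric with a standard `Resource` metric on memory utilization.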