Helm (Recommended)
This is our recommended approach for deploying Sombra in production environments. While there are various ways to deploy Sombra and related components, we recommend Helm for most users, especially those with multi-cloud or hybrid cloud environments.
- Access to our container registry: Before starting, you need access to Transcend's container registry to pull our Sombra image. If you don't have access or are unsure, contact your Transcend account representative.
- Environment variables: Follow Steps 1–3 in our Quickstart guide to get the environment variables you'll need when configuring Sombra.
Follow your hosting provider's instructions to launch a Kubernetes cluster with GPUs attached. You should get to the point where you can connect to the cluster with kubectl.
Set up your cluster on Amazon Elastic Kubernetes Service (EKS).
- Follow NVIDIA's guide to deploy GPU nodes on EKS.
- Connect kubectl to EKS. On AWS, your kubeconfig is typically configured via the aws eks update-kubeconfig command. If you used eksctl, your kubeconfig may have been configured automatically.
For a complete Terraform example that creates all required AWS infrastructure, see the Terraform code in our Quickstart guide.
Set up your cluster on Azure Kubernetes Service (AKS).
- Follow NVIDIA's guide to deploy GPU nodes on AKS.
- Connect kubectl to AKS. On Azure, your kubeconfig is typically configured via az aks get-credentials.
Set up your cluster on Google Kubernetes Engine (GKE).
- Follow NVIDIA's guide to deploy GPU nodes on GKE.
- Connect kubectl to GKE. On GCP, your kubeconfig is typically configured via gcloud container clusters get-credentials.
For on-prem deployments, you can set up your Kubernetes cluster using one of the following solutions:
- Kubeadm: A tool built to easily bootstrap a minimum viable Kubernetes cluster. It is ideal for users familiar with Kubernetes and who require granular control over their cluster configuration. Follow the Kubeadm Setup Guide.
- Rancher: An open-source platform that provides a complete Kubernetes management solution to deploy and manage clusters. Follow the Rancher Setup Guide.
- VMware Tanzu: VMware’s enterprise-grade Kubernetes offering that integrates tightly with VMware infrastructure, making it suitable for environments already invested in VMware. Follow the VMware Tanzu Setup Guide.
- OpenShift: A Kubernetes distribution by Red Hat that provides an enterprise-ready platform with additional tools for development and operations. Follow the OpenShift Setup Guide.
Follow NVIDIA's guides to deploy GPUs to the Kubernetes cluster in your on-premises environment.
You can test a Kubernetes cluster locally using one of these services:
Note: if you do not have an NVIDIA GPU on your machine, you will not be able to use the LLM Classifier in your cluster, which depends on NVIDIA GPUs.
The guide you followed in Step 1 may have already set up your kubectl with your cluster's context. If so, you can proceed to Step 3.
If you haven't already installed the kubectl CLI, follow these instructions to install kubectl.
Check if your cluster's context is already available with:
kubectl config get-contexts
If you do not see your new cluster's context, you need to configure kubectl (via your kubeconfig file) to connect to your cluster. Check the documentation of your hosting provider.
- On AWS, your kubeconfig is typically configured via the aws eks update-kubeconfig command. If you used eksctl to set up your cluster, then your kubeconfig may have been configured automatically.
- On Azure, your kubeconfig is typically configured via az aks get-credentials.
- On GCP, your kubeconfig is typically configured via gcloud container clusters get-credentials.
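As a rough illustration of what those provider commands usually look like, the sketch below prints (but does not run) a kubeconfig command for each provider. The helper name kubeconfig_command and its arguments are hypothetical; the third argument is assumed to be a region for AWS and GCP and a resource group for Azure. Check your provider's documentation for the flags that apply to your cluster (for example, a zonal GKE cluster uses a zone rather than a region).

```shell
#!/bin/sh
# Hypothetical helper (not part of any provider CLI): prints the kubeconfig
# command for a provider so you can review it before running it.
#   $1 = provider (aws | azure | gcp)
#   $2 = cluster name
#   $3 = region (aws, gcp) or resource group (azure)  <- assumption
kubeconfig_command() {
  provider="$1"
  cluster="$2"
  scope="$3"
  case "$provider" in
    aws)   echo "aws eks update-kubeconfig --name $cluster --region $scope" ;;
    azure) echo "az aks get-credentials --name $cluster --resource-group $scope" ;;
    gcp)   echo "gcloud container clusters get-credentials $cluster --region $scope" ;;
    *)     echo "unknown provider: $provider" >&2; return 1 ;;
  esac
}

# Example usage (cluster and region names are placeholders):
#   kubeconfig_command aws my-cluster us-east-1
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without the provider CLIs installed.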
Once you have the NAME of your cluster's context (from kubectl config get-contexts), set kubectl to its context:
kubectl config use-context MY_CONTEXT_NAME
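If you script this step, it can help to confirm that the context exists before switching to it. The following is a minimal sketch, assuming you feed it the output of kubectl config get-contexts -o name (one context name per line). The helper name has_context is hypothetical, and MY_CONTEXT_NAME remains a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the context name given as $1 appears as an
# exact, whole-line match in the list of context names read from stdin.
has_context() {
  grep -Fxq -- "$1"
}

# Example usage (requires kubectl; MY_CONTEXT_NAME is a placeholder):
#   if kubectl config get-contexts -o name | has_context MY_CONTEXT_NAME; then
#     kubectl config use-context MY_CONTEXT_NAME
#   else
#     echo "Context not found; configure your kubeconfig first." >&2
#   fi
```

The fixed-string, whole-line match (-F and -x) avoids treating a context such as prod-eu as a match for prod.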
Add Transcend's Helm repository:
helm repo add transcend https://transcend-io.github.io/helm-charts/
helm repo update
You can list the charts available in our Helm repository with helm search repo transcend. You should see a chart for transcend/sombra.
Create a file named values.yaml with the following content:
imageCredentials:
  password: <API_KEY_FROM_QUICKSTART_STEP_1>

replicaCount: 1

llm-classifier:
  enabled: true

envs:
  - name: ORGANIZATION_URI
    value: <ORGANIZATION_URI_FROM_QUICKSTART_STEP_2>
  - name: SOMBRA_ID
    value: <SOMBRA_ID_FROM_QUICKSTART_STEP_2>
  - name: SOMBRA_REVERSE_TUNNEL_API_KEY
    value: <SOMBRA_REVERSE_TUNNEL_API_KEY_FROM_QUICKSTART_STEP_2>
  - name: TRANSCEND_URL
    value: <TRANSCEND_URL_FROM_QUICKSTART_STEP_2>
  - name: LLM_CLASSIFIER_URL
    value: http://sombra-test-llm-classifier.transcend.svc:6081

envs_as_secret:
  - name: JWT_ECDSA_KEY
    value: '<JWT_ECDSA_KEY_FROM_QUICKSTART_STEP_3>'
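Because the sample values.yaml contains several angle-bracket placeholders, a quick check before installing can catch any that were left unfilled. This is a minimal sketch, not part of Helm or the Sombra chart: the helper name find_unfilled_placeholders is hypothetical, and it assumes placeholders always look like an uppercase token wrapped in angle brackets.

```shell
#!/bin/sh
# Hypothetical helper: prints any lines (with line numbers) that still contain
# an <UPPERCASE_PLACEHOLDER> token. Returns 0 if none are found, 1 if any are
# found, and 2 if the file cannot be read.
find_unfilled_placeholders() {
  values_file="$1"
  if [ ! -r "$values_file" ]; then
    echo "cannot read file: $values_file" >&2
    return 2
  fi
  if grep -n -E '<[A-Z0-9_]+>' "$values_file"; then
    return 1
  fi
  return 0
}

# Example usage (assumes values.yaml is in the current directory):
#   find_unfilled_placeholders ./values.yaml && echo "no placeholders left"
```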
Missing some variables?
Refer to Steps 1–3 in our Quickstart guide to get the environment variables.
Now, deploy Sombra to your cluster. In Helm terms, you install a chart onto your cluster, and each installation is called a "release".
Install the sombra chart onto your cluster:
helm install sombra-test transcend/sombra --values=./values.yaml
For this test run, we've chosen the name "sombra-test" for our release. You can replace "sombra-test" with any name you'd like. The release name is used only to distinguish between several installations of the same chart in your cluster—which is not a typical situation with Sombra. It should be a lowercase string with dashes.
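If you pick a different release name, note that the sample values.yaml embeds "sombra-test" in LLM_CLASSIFIER_URL. The sketch below checks a candidate name and prints the matching URL. Both helper names are hypothetical. The rule that a name must start and end with a letter or digit is an assumption borrowed from Kubernetes DNS-style naming, and the "<release>-llm-classifier" service name pattern is inferred from the sample value rather than confirmed by the chart, so verify it with kubectl get services -n transcend.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the name contains only lowercase letters,
# digits, and dashes, and starts and ends with a letter or digit (assumption).
is_valid_release_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Hypothetical helper: prints the LLM_CLASSIFIER_URL for a release, ASSUMING
# the service follows the "<release>-llm-classifier" pattern seen in the
# sample values.yaml.
llm_classifier_url() {
  printf 'http://%s-llm-classifier.transcend.svc:6081\n' "$1"
}

# Example usage:
#   is_valid_release_name sombra-prod && llm_classifier_url sombra-prod
```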
First, check that the Helm chart was installed into your cluster:
kubectl get all --namespace="transcend"
On namespaces: The chart will create a namespace in your cluster called transcend. Most calls we'll make to kubectl require --namespace="transcend" (or in short form: -n transcend). Bare calls (e.g., kubectl get all) will turn up empty results, since kubectl looks in the default namespace (named default).
Tip: Rather than type --namespace="transcend" each time, add an alias:
alias k="kubectl --namespace=transcend"
k get all  # kubectl --namespace="transcend" get all
k events   # kubectl --namespace="transcend" events
Go to the Sombra Gateways page in the Transcend Admin Dashboard and use the "Test Gateway Connection" button to verify connectivity. A successful response confirms that your Sombra and LLM Classifier deployment is working correctly.
The Customizing Sombra guide covers all options available, but we recommend:
- Adding bearer authentication for your Sombra clients with the INTERNAL_KEY_HASH variable.
  Here is an example of configuring a load balancer in AWS with custom DNS and SSL via AWS ACM. As long as your services can connect to the ingress, you can configure this Kubernetes ingress however you wish.
  customer_service:
    type: NodePort
  customer_ingress:
    enabled: true
    className: alb
    annotations:
      alb.ingress.kubernetes.io/certificate-arn: <CERT_ARN>
      alb.ingress.kubernetes.io/healthcheck-path: /health
      alb.ingress.kubernetes.io/healthcheck-port: '5039'
      alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 5039}]'
      alb.ingress.kubernetes.io/scheme: internal
      alb.ingress.kubernetes.io/subnets: <VPC_PRIVATE_SUBNET>
      alb.ingress.kubernetes.io/tags: env=dev
      alb.ingress.kubernetes.io/target-type: ip
    hosts:
      - host: <SOMBRA_CUSTOMER_INGRESS_DOMAIN>
        paths:
          - path: /
            pathType: Prefix
  ...
  envs_as_secret:
    ...
    - name: INTERNAL_KEY_HASH
      value: '<INTERNAL_KEY_HASH>'
- Setting up SSO for your admin users
If you encounter issues during deployment, follow these steps:
- Diagnose the issue using Kubernetes events and logs:
  kubectl events -n transcend
  kubectl logs -n transcend deployment/sombra-test
- Fix the issue based on the error messages (see common issues below).
- Upgrade the release with your changes:
  helm upgrade sombra-test transcend/sombra --values=./values.yaml
You don't have access to our Docker registry. Please reach out to Transcend support. Our support team manually allowlists which organizations have access to our images.
$ kubectl events -n transcend
LAST SEEN   TYPE      REASON   OBJECT                             MESSAGE
6m44s       Warning   Failed   Pod/sombra-test-559b7796b9-j7r2w   Failed to pull image "docker.transcend.io/sombra:latest": Error response from daemon: error parsing HTTP 403 response body: no error details found in HTTP response body: "{\"Message\":\"User is not authorized to access this resource with an explicit deny\"}\n"
$ kubectl events -n transcend
LAST SEEN           TYPE      REASON             OBJECT                                            MESSAGE
27m (x2 over 32m)   Warning   FailedScheduling   Pod/sombra-test-llm-classifier-5c9486974b-drq9v   0/1 nodes are available: 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
You need to increase the memory in your cluster.
$ kubectl events -n transcend
LAST SEEN   TYPE      REASON             OBJECT                                            MESSAGE
37m         Warning   FailedScheduling   Pod/sombra-test-llm-classifier-5c9486974b-drq9v   0/1 nodes are available: 1 Insufficient nvidia.com/gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Your cluster needs to have an NVIDIA GPU and the relevant drivers installed. This is managed by the NVIDIA GPU Operator. Have you installed the GPU Operator into your cluster? See the NVIDIA guide for your hosting environment (linked in Step 1).
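The three failure messages above can be matched mechanically when triaging a deployment. The following is a minimal sketch that reads kubectl events output on stdin and prints a hint for each recognized message. The helper name suggest_fix and the hint wording are hypothetical, and it only recognizes the exact message fragments shown in this section.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl events` output on stdin and prints one
# hint per line that matches a failure message shown in this section.
suggest_fix() {
  while IFS= read -r line; do
    case "$line" in
      *"Failed to pull image"*"HTTP 403"*)
        echo "registry: ask Transcend support to allowlist your organization" ;;
      *"Insufficient memory"*)
        echo "memory: increase the memory available in your cluster" ;;
      *"Insufficient nvidia.com/gpu"*)
        echo "gpu: add an NVIDIA GPU node and install the NVIDIA GPU Operator" ;;
    esac
  done
}

# Example usage (requires kubectl); sort -u collapses repeated hints:
#   kubectl events -n transcend | suggest_fix | sort -u
```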
# Update the Helm repository
helm repo update
# Update the release
helm upgrade sombra-test transcend/sombra --values=./values.yaml