Sombra LLM Classifier

The LLM (Large Language Model) Classifier is an application designed to improve data classification using advanced natural language processing techniques. It is a separate service that attaches to Sombra. Customers self-hosting Sombra can run the LLM Classifier in the same private network as Sombra.

The LLM Classifier container runs a gunicorn server that listens for requests and performs LLM classification on the inputs provided in the request body. Because LLM inference is computationally expensive, the container must run on a node with an NVIDIA Ampere GPU, such as the A10.

The LLM Classifier container listens on port 6081 by default. This can be changed using the LLM_SERVER_PORT environment variable.

To enable HTTPS connections to the LLM Classifier server, mount the SSL certificate and key files into the container and set the paths to these files using the LLM_CERT_PATH and LLM_KEY_PATH environment variables, respectively.
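Putting the options above together, a standalone deployment might look like the following docker run invocation. This is a sketch, not a required layout: the host path /opt/tls, the in-container path /certs, and the certificate filenames are assumptions for illustration.

```shell
# Sketch: run the LLM Classifier with a custom port and HTTPS enabled.
# /opt/tls (host) and /certs (container) are example paths, not required names.
docker run -d \
  --gpus all \
  -p 6081:6081 \
  -e LLM_SERVER_PORT=6081 \
  -e LLM_CERT_PATH=/certs/server.crt \
  -e LLM_KEY_PATH=/certs/server.key \
  -v /opt/tls:/certs:ro \
  docker.transcend.io/llm-classifier:<version_tag>
```
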

You can pull our image from Transcend's private Docker registry using basic authentication.

First, please contact us and request permission to pull the llm-classifier image. We will then add your Transcend account to our permissions list.

Once we have added you to our allow list, you can log in to our private registry:

docker login docker.transcend.io

You will be prompted to enter the basic auth credentials. The username will always be "Transcend" (this is case-sensitive), and the password will be any API Key for your organization within the Admin Dashboard (note: a scope is not required for the API key).

Once you've logged in, you may pull images by running:

docker pull docker.transcend.io/llm-classifier:<version_tag>

You can deploy the LLM Classifier alongside Sombra in the same private network and configure your network to allow Sombra to reach the Classifier.

You can follow our Helm chart guide to deploy the LLM Classifier along with Sombra in a Kubernetes cluster, or use the following sample config to deploy the LLM Classifier in a Kubernetes cluster.

apiVersion: v1
kind: Namespace
metadata:
  name: transcend
---
apiVersion: v1
kind: Service
metadata:
  name: llm-classifier-ingress
  namespace: transcend
spec:
  selector:
    app: llm-classifier-app
  ports:
    - protocol: TCP
      port: 6081
      targetPort: 6081
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-classifier-app
  namespace: transcend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-classifier-app
  template:
    metadata:
      labels:
        app: llm-classifier-app
    spec:
      containers:
        - name: llm-classifier-container
          image: docker.transcend.io/llm-classifier:<version_tag>
          ports:
            - name: http
              containerPort: 6081
              protocol: TCP
          env:
            - name: LLM_SERVER_PORT
              value: '6081'
          resources:
            limits:
              memory: 8Gi
              nvidia.com/gpu: '1'
          livenessProbe:
            httpGet:
              path: /health/ping
              port: 6081
              scheme: HTTP
            timeoutSeconds: 30
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 10
          startupProbe:
            httpGet:
              path: /health/ping
              port: 6081
              scheme: HTTP
            timeoutSeconds: 30
            periodSeconds: 20
            successThreshold: 1
            failureThreshold: 10

Update the LLM_CLASSIFIER_URL environment variable in your Sombra deployment to point to the llm-classifier-ingress created above: LLM_CLASSIFIER_URL=http://<llm-classifier-service-cluster-ip>:<llm_service_port>
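With the sample manifest above, the Kubernetes service DNS name can be used in place of the cluster IP. Assuming the transcend namespace, the llm-classifier-ingress service name, and port 6081 from that config, the setting would look like:

```shell
# Example Sombra environment setting; the DNS name follows from the sample
# manifest above (service llm-classifier-ingress in namespace transcend).
LLM_CLASSIFIER_URL=http://llm-classifier-ingress.transcend.svc.cluster.local:6081
```
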

In the event you have an issue with the LLM Classifier that requires Transcend support, the classifier has a configuration option that allows it to send its logs to Transcend's servers.

Please be aware that this option may expose metadata from your data silos to Transcend, and we recommend turning this feature off once the issue has been resolved.

To use this feature, you will need to generate an API key with the LLM Log Transfer scope. In the Admin Dashboard, select API Keys under Developer Tools, then create a new key by clicking the + button in the upper right:

Generate API Key

Please set the LOG_HTTP_TRANSPORT_URL to https://collector.transcend.io/api/v1/logs if you are hosting Transcend in the EU, and https://collector.us.transcend.io/api/v1/logs if you are hosting Transcend in the US.

Environment Variable                Description                                         Required
LOG_HTTP_TRANSPORT_URL              The Transcend Collector's HTTPS ingress endpoint.   Yes
LOG_FORWARDING_TRANSCEND_API_KEY    The log forwarding API key.                         Yes
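For example, the two log forwarding variables can be passed to the container at startup. This sketch uses the US collector endpoint from above; the API key value is a placeholder for the key you generated with the LLM Log Transfer scope.

```shell
# Sketch: enable log forwarding to Transcend support (US-hosted example).
# Remove these variables once the support issue is resolved.
docker run -d \
  --gpus all \
  -e LOG_HTTP_TRANSPORT_URL=https://collector.us.transcend.io/api/v1/logs \
  -e LOG_FORWARDING_TRANSCEND_API_KEY=<llm_log_transfer_api_key> \
  docker.transcend.io/llm-classifier:<version_tag>
```
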

We recommend using a node with an NVIDIA A10G GPU, such as AWS's g5.2xlarge instance or Google's A2 machine series. With the LLM Classifier running on two g5.2xlarge nodes, you can get about 18,000 classifications per hour. On-demand pricing per node is $1.212 per hour ($0.485 per hour with a 3-year reservation) as of September 3, 2024. If you need more throughput (more classifications per hour), you can add more instances of the LLM Classifier to scale the throughput linearly.