LLM Classifier
The LLM (Large Language Model) Classifier is an application designed to improve data classification accuracy using advanced natural language processing techniques. It is a separate, optional service that is attached to Sombra. Customers self-hosting Sombra can run the LLM Classifier in the same private network as Sombra.
Our deployment guides provide instructions on how to deploy the LLM Classifier using different configuration setups. If you are using our recommended Helm Chart, deploying the LLM Classifier simply requires adding the following to your values.yaml file:

llm-classifier:
  enabled: true
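To apply the change, upgrade your Helm release with the updated values. A minimal sketch, assuming a release named sombra (the release and chart names here are placeholders; substitute the ones from your existing deployment):

helm upgrade sombra <chart> -f values.yaml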
The LLM Classifier container runs a gunicorn server that listens for requests and performs LLM classification on the inputs provided in the request body. Since LLMs perform more efficiently on Graphics Processing Units (GPUs), the container must run on a node with an NVIDIA GPU, such as the A10.
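To confirm that a node actually exposes an NVIDIA GPU, you can run a quick check on the host, assuming the NVIDIA drivers are installed:

nvidia-smi --query-gpu=name,memory.total --format=csv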
By default, the LLM Classifier container listens on port 6081. This can be changed using the LLM_SERVER_PORT environment variable.
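For example, here is a minimal sketch of running the container on a custom port with Docker (the host port mapping is illustrative, and the --gpus flag requires the NVIDIA Container Toolkit; see the registry instructions below for pulling this image):

docker run --gpus all -p 8080:8080 \
  -e LLM_SERVER_PORT=8080 \
  docker.transcend.io/llm-classifier:<version_tag>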
To enable HTTPS connections to the LLM Classifier server, you can mount the SSL certificate and key file to the container and set the paths to these files using the LLM_CERT_PATH and LLM_KEY_PATH environment variables, respectively.
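Outside of Kubernetes, this might look like the following sketch, which mounts a host directory containing the certificate and key (the host path and file names are placeholders):

docker run --gpus all -p 6081:6081 \
  -v /path/to/ssl:/etc/llm-classifier/ssl:ro \
  -e LLM_CERT_PATH=/etc/llm-classifier/ssl/llm-classifier.cert \
  -e LLM_KEY_PATH=/etc/llm-classifier/ssl/llm-classifier.key \
  docker.transcend.io/llm-classifier:<version_tag>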
You can pull our image from Transcend's private Docker registry using basic authentication.
First, please contact us and request permission to pull the llm-classifier image. We will then add your Transcend account to our permissions list.
Once we have added you to our allow list, you can log in to our private registry:
docker login docker.transcend.io
You will be prompted to enter the basic auth credentials. The username is always "Transcend" (this is case-sensitive), and the password can be any API key for your organization within the Admin Dashboard (note: a scope is not required for the API key).
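In non-interactive environments such as CI, you can instead pipe the API key to docker login. A sketch, assuming the key is stored in an environment variable named TRANSCEND_API_KEY (the variable name is a placeholder):

echo "$TRANSCEND_API_KEY" | docker login docker.transcend.io --username Transcend --password-stdin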
Once you've logged in, you may pull images by running:
docker pull docker.transcend.io/llm-classifier:<version_tag>
We recommend using a node with an NVIDIA A10G GPU, such as AWS's g5.2xlarge instance or Google's A2 machine series.
With the LLM Classifier running on two g5.2xlarge nodes, you can process approximately 18,000 classifications per hour. On-demand pricing per node is $1.212 per hour ($0.485 per hour with 3-year reserved pricing) as of April 2023, which works out to roughly 9,000 classifications per node-hour, or about $0.00013 per classification at on-demand rates.
If you need more throughput (more classifications per hour), you can add more instances of the LLM Classifier to linearly scale the throughput.
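In a Helm deployment, this would mean raising the subchart's replica count. A sketch, assuming the llm-classifier subchart follows the common replicaCount convention (check the chart's values for the actual key):

llm-classifier:
  enabled: true
  replicaCount: 4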
This values.yaml adds an accompanying LLM Classifier to your Sombra deployment. The LLM Classifier requires an NVIDIA GPU to run, so please make sure your cluster supports nvidia.com/gpu as a resource.
envs:
  # ... other env vars
  - name: LLM_CLASSIFIER_URL
    value: http://<release-name>-llm-classifier.transcend.svc:6081

llm-classifier:
  enabled: true
Or with TLS termination at Sombra and the LLM Classifier server:
envs:
  # ... other env vars
  - name: LLM_CLASSIFIER_URL
    value: https://<release-name>-llm-classifier.transcend.svc:6081

envs_as_secret:
  # ... other env vars
  - name: SOMBRA_TLS_CERT
    value: <SOMBRA_TLS_CERT>
  - name: SOMBRA_TLS_KEY
    value: <SOMBRA_TLS_KEY>
  # An optional passphrase associated with your TLS private key. If you set a
  # passphrase when you created your key and certificate, you must provide it here.
  - name: SOMBRA_TLS_KEY_PASSPHRASE
    value: <SOMBRA_TLS_KEY_PASSPHRASE>

llm-classifier:
  enabled: true
  tls:
    enabled: true
    # saved as secret
    cert: |-
      -----BEGIN CERTIFICATE-----
      <base64>
      -----END CERTIFICATE-----
    # saved as secret
    key: |-
      -----BEGIN PRIVATE KEY-----
      <base64>
      -----END PRIVATE KEY-----
  # volume containing cert and key
  volumes:
    - name: llm-classifier-ssl
      secret:
        secretName: llm-classifier-secrets
  # mount the directory containing the cert and key to the pod
  volumeMounts:
    - mountPath: '/etc/llm-classifier/ssl'
      name: llm-classifier-ssl
      readOnly: true
  # Set the location of the cert and key in the environment
  envs:
    - name: LLM_CERT_PATH
      value: '/etc/llm-classifier/ssl/llm-classifier.cert'
    - name: LLM_KEY_PATH
      value: '/etc/llm-classifier/ssl/llm-classifier.key'
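Whichever variant you deploy, you can verify that the classifier pod is running (the transcend namespace below is taken from the service DNS name above):

kubectl get pods -n transcend | grep llm-classifier

To confirm the server is presenting the mounted certificate, you can inspect the TLS handshake from a pod inside the cluster, assuming openssl is available there:

openssl s_client -connect <release-name>-llm-classifier.transcend.svc:6081 -showcerts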