Please provide the following information when creating a topic:
Hardware Platform (GPU model and numbers): 8xA100
System Memory
Ubuntu Version: Ubuntu 22.04.5 LTS
NVIDIA GPU Driver Version (valid for GPU only): 570.124.06
Issue Type (questions, new requirements, bugs): deployment issue
I am trying to deploy the VSS agent on a remote 8xA100 server. I have early access for VSS and the required NGC_API_KEY. I am deploying with Docker Compose as described in the documentation, but I am running into an issue while pulling the image. The docker login nvcr.io step completed successfully. The next step in the docs was to pull the Llama 3.1 70B model from the hub, which is where the execution on the server failed. I ran the commands below:
cd remote_llm_deployment
nano config.yaml → here I added the NGC_API_KEY and NVIDIA_API_KEY
docker compose up
I got the above error
I also later tried the remote VLM deployment method:
cd remote_vlm_deployment
nano .env → added the OpenAI API key and the NGC/NVIDIA API keys
docker compose up
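The .env just contained the keys, roughly like this (values redacted; the exact variable names are from memory, so they may differ slightly from what the sample .env expects):
OPENAI_API_KEY=<redacted>
NGC_API_KEY=<redacted>
NVIDIA_API_KEY=<redacted>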
I got the same error in both cases. I'm not sure what the error is, but when I tried the deployment a week ago on a Brev launchable it worked, so I'm guessing it's not due to my VSS early access expiring.
As the deployments were done on remote servers provided by Lambda Labs, could it be a permissions issue? I did use root permissions while running.
They provide Ubuntu servers with different GPU configurations, with NVIDIA drivers, the CUDA toolkit, and Docker pre-installed. You can rent servers on a per-hour basis.
OK, so I have resolved that particular error by adding my user to the docker group with:
sudo usermod -aG docker ${USER}
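(As I understand it, the group change only takes effect after logging out and back in, or after starting a new shell with the group applied, e.g.:
newgrp docker
docker run --rm hello-world
The second command is just a quick sanity check that docker works without sudo.)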
Then I logged in using sudo, which I hadn't done previously. Now there is a new error when I run docker compose up:
[+] Running 0/1
⠋ Container remote_vlm_deployment-via-server-1 Creating 0.0s
Error response from daemon: unknown or invalid runtime name: nvidia
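From what I can tell, this error usually means the nvidia runtime is not registered with the Docker daemon, i.e. the NVIDIA Container Toolkit has not written it into /etc/docker/daemon.json. The check/fix I came across is roughly the following, though I have not confirmed this is the actual root cause on this server:
docker info | grep -i runtimes
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker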
CUDA version used:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jan_15_19:20:09_PST_2025
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0
NVIDIA driver: 570.xxx
Could the NVIDIA 570 driver or the CUDA version be the issue, or would it be something else? The build process does start and gets quite far along before it fails.
The docs mention that 535.x is the minimum recommended version, so I assumed 570.x should work?
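To try to rule the driver and container toolkit out, my plan is to run a plain CUDA container and check nvidia-smi inside it, something along these lines (the image tag is just a guess on my part and may need adjusting):
docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi
If that prints the GPUs, I would assume the driver itself is fine.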
Instead of deploying the 70B model, I deployed the Llama 3.1 8B model as a test, and it deployed without any issues. I verified with a curl request and got a response indicating the deployment was successful.
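For reference, the check was along these lines (the port and model name depend on how the compose file maps things, so treat them as placeholders):
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta/llama-3.1-8b-instruct", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 32}'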
If I do have to downgrade my NVIDIA driver from 570.x to 535.x, what would be the best way to go about it?
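In case it helps to be concrete, the approach I was considering (assuming the driver on the Lambda Labs image was installed through apt, which I still need to confirm) is roughly:
sudo apt purge 'nvidia-driver-570*' 'nvidia-dkms-570*'
sudo apt autoremove
sudo apt install nvidia-driver-535
sudo reboot
Happy to hear if there is a cleaner way to do this.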