401 unauthorized access

Please provide the following information when creating a topic:

  • Hardware Platform (GPU model and numbers): 8xA100
  • System Memory
  • Ubuntu Version: Ubuntu 22.04.5 LTS
  • NVIDIA GPU Driver Version (valid for GPU only): 570.124.06
  • Issue Type( questions, new requirements, bugs): deployment issue

I am trying to deploy the VSS agent on a remote 8xA100 server. I have Early access for the VSS and the required NGC_API_KEY. I am trying to deploy using docker compose as described in the documentation however I am facing some issue while pulling the image. The login was successfull from the docker login nvcr.io step. The next described step was to pull the llama 3.1 70 B model from the hub, which is where the execution on the server failed. I ran the below command:

export NGC_API_KEY=
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p “$LOCAL_NIM_CACHE”
docker run -d -it
–gpus ‘“device=1,2”’
–shm-size=16GB
-e NGC_API_KEY
-v “$LOCAL_NIM_CACHE:/opt/nim/.cache”
-u $(id -u)
-p 8000:8000
nvcr.io/nim/meta/llama-3.1-70b-instruct:1.3.3

But I get this error:

Unable to find image ‘nvcr.io/nim/meta/llama-3.1-70b-instruct:1.3.3’ locally
docker: Error response from daemon: unauthorized:

401 Authorization Required

401 Authorization Required


nginx/1.22.1

Is this an access issue or login issue?

Did you follow our Guide step by step as described from the deploy-using-docker-compose?

Yes, I followed the following steps while doing the deployment:

git clone GitHub - NVIDIA-AI-Blueprints/video-search-and-summarization: Blueprint for Ingesting massive volumes of live or archived videos and extract insights for summarization and interactive Q&A
cd video-search-and-summarization/deploy/docker
docker login nvcr.io

After the login was successful,

cd remote_llm_deployment
nano config.yaml → here I added the NGC and NVIDIA_API_KEYs
docker compose up

I got the above error

I also later tried the remote vlm deployment method:

cd remote_vlm_deployment
nano .env → added openai api key, ngc, nvidia keys
docker compose up

I got the same error in both cases. I’m not sure what the error is but when I tried the deployment a week ago on a brev launchable it had worked so I’m guessing its not due to expiry of my VSS.
As the deployments were done on remote servers provided by lambdalabs could it be issues with some permissions? I did use root permissions while running

What does this “lambdalabs” specifically refer to?

they provide ubuntu servers with different configurations of GPUs with NVIDIA drivers, CUDA toolkit, docker pre installed. You can rent servers on a per hour basis

Ok so I have resolved that particular error by adding user to docker group using:

sudo usermod -aG docker ${USER}

Then I logged in using sudo which I hadn’t done previously. Now there is a new error that I face when I do docker compose up:

[+] Running 0/1
⠋ Container remote_vlm_deployment-via-server-1 Creating 0.0s
Error response from daemon: unknown or invalid runtime name: nvidia

CUDA version used:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jan_15_19:20:09_PST_2025
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0

NVIDIA driver: 570.xxx

Can nvidia driver 570 or CUDA version be an issue or would it be something else? The build process does seem to start and completes quite a bit until it fails

Are you able to run the steps for deploying the Llama 3.1 70b NIM per this page?

GPU version 570 has not been validated. The prerequisites instruct to install version 535.X

Hi,

In the docs it is mentioned 535.X is the minimum recommended version so I thought version 570.X should work?
Instead of deploying the 70b model I deployed the llama3.1 8b model to test and it deployed without any issues. I verified with a curl request and I got a response indicating the deployment was successful

In the case that I have to downgrade my nvidia drivers from 570.X to 535.X what would be the best way to go about it?

Some build logs for the same

[+] Running 4/5
✔ Network remote_vlm_deployment_default Created 0.1s
✔ Volume “remote_vlm_deployment_via-ngc-model-cache” Created 0.0s
✔ Volume “remote_vlm_deployment_via-hf-cache” Created 0.0s
✔ Container remote_vlm_deployment-graph-db-1 Created 2.1s
⠋ Container remote_vlm_deployment-via-server-1 Creating 0.0s
Error response from daemon: unknown or invalid runtime name: nvidia

Have you installed the container toolkit by referring to our container-toolkit/install-guide?

Hi,

Thanks for the response, I needed to configue the container runtime by using:
sudo nvidia-ctk runtime configure --runtime=docker

After that it successfully ran, Thank you!