NIM embedding model downloads but fails with auth error on startup

These steps are based on https://build.nvidia.com/nvidia/nv-embedqa-e5-v5?snippet_tab=Docker

Device: Ubuntu desktop over ssh.

Authenticate to nvcr.io with `docker login nvcr.io`:

(base) joe@hp-z820:~$ docker login nvcr.io
Username: $oauthtoken
Password:
WARNING! Your password will be stored unencrypted in /home/joe/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

Then I run the docker command to start the container. The image pull succeeds, but on startup the container fails to download its model resources.

(base) joe@hp-z820:~$ bash nim-embedding-docker.sh
Unable to find image 'nvcr.io/nim/nvidia/nv-embedqa-e5-v5:1.0.0' locally
1.0.0: Pulling from nim/nvidia/nv-embedqa-e5-v5
a8b1c5f80c2d: Pulling fs layer
56e0177fe82f: Pulling fs layer
4f4fb700ef54: Pulling fs layer
4cf678acad36: Pulling fs layer
4cf678acad36: Waiting
c4e9a39a39cb: Waiting
722debad5c6a: Waiting
3483c58afaf2: Waiting
2e2526a1f7fc: Waiting
496b316ab8c8: Waiting
eb5b90d0a722: Waiting
b0176daeac19: Waiting
72c4e897fc78: Waiting
bd2a9cf1b9df: Waiting
926f74aa6f19: Pulling fs layer
b1c7ffdebc6a: Waiting
a5dd11fafb60: Waiting
0f3c3effde75: Waiting
d28cfe6559de: Waiting
e18f944c176c: Waiting
d9b15bbaae55: Waiting
9564aa11ac85: Waiting
17458f446788: Waiting
0d4a0b3db58c: Waiting
d46f7e04d572: Pulling fs layer
4d49242549b3: Waiting
374b4a6f077c: Pull complete
3d2ad0a96da8: Pull complete
2ba69f19073f: Pull complete
9284f3662544: Pull complete
53ec11b3dc1e: Pull complete
224aeb200553: Pull complete
e795b15a9830: Pull complete
6958c8f6d73b: Pull complete
edea289153a5: Pull complete
f3bb4e9447f9: Pull complete
f646328c579e: Pull complete
c8ffb8b7150b: Pull complete
2d850ecaedea: Pull complete
c4b5b4200956: Pull complete
5e931e8f849d: Pull complete
3f061f630cca: Pull complete
aa1aa2ead46f: Pull complete
a0fec4672978: Pull complete
6a54b2f82edb: Pull complete
7a19d032e2fd: Pull complete
918a62ce3467: Pull complete
1618c4ef78e9: Pull complete
d027a1ba1141: Pull complete
636d041245b3: Pull complete
738cb8374ebc: Pull complete
62ea13c60b37: Pull complete
567bf82dbcb7: Pull complete
afe4761b0b93: Pull complete
1e3e5ff2e94b: Pull complete
Digest: sha256:9021d5f8d9bb0ab8847355153dc837249cf6b8f191d4397d326f6185aab6a454
Status: Downloaded newer image for nvcr.io/nim/nvidia/nv-embedqa-e5-v5:1.0.0

=========================================
== NVIDIA Retriever Text Embedding NIM ==
=========================================

NVIDIA Release 1.0.0-rc19 (build 7d72a70ff4bb0ae21b4b56cf07070a350cfe6965)
Model: nvidia/nv-embedqa-e5-v5

Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This NIM container is governed by the NVIDIA AI Product Agreement here:
https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/.
A copy of this license can be found under /opt/nim/LICENSE.

The use of this model is governed by the AI Foundation Models Community License
here: https://docs.nvidia.com/ai-foundation-models-community-license.pdf.

OTEL Logging handler requested, but Python logging auto-instrumentation not set up. Set OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true to enable logging auto-instrumentation.
downloading nim/nvidia/nv-embedqa-e5-v5:5_tokenizer
This could take a while.
--2024-08-14 00:07:43--  https://api.ngc.nvidia.com/v2/org/nim/team/nvidia/models/nv-embedqa-e5-v5/versions/5_tokenizer/zip
Resolving api.ngc.nvidia.com (api.ngc.nvidia.com)... 44.239.225.48, 34.214.22.13
Connecting to api.ngc.nvidia.com (api.ngc.nvidia.com)|44.239.225.48|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized

Username/Password Authentication Failed.

2024-08-14T00:07:44Z WARNING: tools.nim.ngc_models - Failed to download from nim/nvidia/nv-embedqa-e5-v5:5_tokenizer: Failure from NGC CLI -

The script is

export NGC_API_KEY=nvapi-<a long string>

export NIM_MODEL_NAME=nvidia/nv-embedqa-e5-v5
export CONTAINER_NAME=$(basename $NIM_MODEL_NAME)

# Choose a NIM Image from NGC
export IMG_NAME="nvcr.io/nim/$NIM_MODEL_NAME:1.0.0"

# Choose a path on your system to cache the downloaded models
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"

# Start the NIM
docker run -it --rm --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8000:8000 \
  $IMG_NAME
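
One thing worth ruling out first: `-e NGC_API_KEY` with no value forwards whatever the calling shell has in its environment, so an empty or unset variable would reach the container without any warning. A minimal sketch of a guard that could sit above the `docker run` line. `require_ngc_key` is a hypothetical helper name, not part of the NVIDIA-provided script.

```shell
# Hypothetical helper, not part of the NVIDIA-provided script.
# Returns 0 if NGC_API_KEY holds a non-empty value, 1 otherwise.
# Note: this only checks the shell variable; it does not verify
# that the variable is exported (the script above does export it).
require_ngc_key() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is empty or unset; the container would start without a key" >&2
    return 1
  fi
  return 0
}

# Possible usage, placed before the docker run line:
#   require_ngc_key || exit 1
```

This does not say anything about whether the key is valid, only that something is being passed in.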

Is this error coming from the bootstrap inside the container?

Can you confirm that you created a personal API key through your NGC profile? That is what is expected in the NGC_API_KEY section of the script.


I have two keys and tried both. Both start with 'nvapi-'. Both worked for pulling the image from the container registry. Both have these access scopes:

nv-cloud-functions
artifact-catalog

The one created by ai workbench also had

secrets-manager

I also tried the key generated on the NVIDIA NIM | nv-embedqa-e5-v5 page, with no success.

Just to cover our troubleshooting bases, can you please try generating an API key that does not start with nvapi-?


That worked.

  1. Generated an API key here https://org.ngc.nvidia.com/setup/api-key
  2. The newly generated key did not start with nvapi-

That is kind of unfortunate if I need to set NGC_API_KEY differently in different places. I’m pretty sure the NGC_API_KEY environment variable in the NIM-anywhere project is one of the nvapi-... keys, and we see this same problem there.

It is possible I picked up model version 1.0.0 in one of the rounds of testing. Personal Access Token support looks to have been added in version 1.0.1
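
For anyone hitting this later, a minimal sketch of a pre-flight check built only on what was observed in this thread: nvapi- keys returned 401 with the 1.0.0 image, and a key without that prefix worked. The function names and the "nvapi"/"other" labels are made up for illustration and are not NVIDIA terminology.

```shell
# Hypothetical helper: classify a key by its prefix only.
# Prints "nvapi" for keys starting with "nvapi-", "empty" for an
# empty string, and "other" for anything else.
ngc_key_kind() {
  case "$1" in
    "")      echo "empty" ;;
    nvapi-*) echo "nvapi" ;;
    *)       echo "other" ;;
  esac
}

# Hypothetical helper: return 1 (with a warning on stderr) when an
# nvapi- key is paired with image tag 1.0.0, the combination that
# failed in this thread. Any other combination returns 0.
warn_if_key_mismatch() {
  key_kind=$(ngc_key_kind "$1")
  image_tag=$2
  if [ "$key_kind" = "nvapi" ] && [ "$image_tag" = "1.0.0" ]; then
    echo "warning: nvapi- key with image tag 1.0.0 returned 401 in this thread" >&2
    return 1
  fi
  return 0
}

# Possible usage in the launch script:
#   warn_if_key_mismatch "$NGC_API_KEY" "1.0.0" || exit 1
```

The check is deliberately narrow. It only flags the one combination reported here and makes no claim about other image tags or other NIMs.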


I am facing the same issue now, following the steps in: NVIDIA NIM for Text Embedding | NVIDIA NGC

Device: Ubuntu 22.04

docker login nvcr.io is successful.

When I deploy the embedding NIM, model version 1.0.0, I hit the same error.

I am only using an NGC API key generated from: org.ngc.nvidia.com/setup/api-key

What might be the issue here?

We are having the same problem.

We are deploying the nv-embedqa-mistral-7b-v2:1.0.1 model on a DGX A100 cluster using Kubernetes with the following configuration:

replicaCount: 1
image:
  repository: "nvcr.io/nim/nvidia/nv-embedqa-mistral-7b-v2"
  tag: "1.0.1"
  pullPolicy: IfNotPresent

During deployment, the service fails to initialize due to an attempt to download a missing asset (2_tokenizer_512) from NVIDIA NGC, resulting in an HTTP 410 Gone error. Despite preloading assets and trying configuration changes, the container appears to have a hardcoded dependency on this unavailable resource, blocking deployment.


Same issue with nvcr.io/nim/nvidia/nv-rerankqa-mistral-4b-v3:1.0.2
We are able to download the image, but when the container starts, it fails with this:
=========================================
== NVIDIA Retriever Text Reranking NIM ==
=========================================
NVIDIA Release 1.0.2 (build 94101e56e865f68d5dfadb5d02af31d64d86b8eb)
Model: nvidia/nv-rerankqa-mistral-4b-v3
Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This NIM container is governed by the NVIDIA AI Product Agreement here:
https://www.nvidia.com/en-us/data-center/products/nvidia-ai-enterprise/eula/.
A copy of this license can be found under /opt/nim/LICENSE.
The use of this model is governed by the AI Foundation Models Community License
here: https://docs.nvidia.com/ai-foundation-models-community-license.pdf.
OTEL Logging handler requested, but Python logging auto-instrumentation not set up. Set OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true to enable logging auto-instrumentation.
downloading nim/nvidia/nv-rerankqa-mistral-4b-v3:3_tokenizer_v3
This could take a while.
--2025-01-21 17:30:56--  https://api.ngc.nvidia.com/v2/org/nim/team/nvidia/models/nv-rerankqa-mistral-4b-v3/versions/3_tokenizer_v3/zip
Resolving api.ngc.nvidia.com (api.ngc.nvidia.com)... 54.68.27.61, 44.227.231.63
Connecting to api.ngc.nvidia.com (api.ngc.nvidia.com)|54.68.27.61|:443... connected.
HTTP request sent, awaiting response... 410 Gone
2025-01-21 17:30:57 ERROR 410: Gone.
2025-01-21T17:30:57Z WARNING: tools.nim.ngc_models - Failed to download from nim/nvidia/nv-rerankqa-mistral-4b-v3:3_tokenizer_v3: Failure from NGC CLI -
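
Two different status codes show up in the logs in this thread, 401 earlier and 410 here, and under standard HTTP semantics they mean different things: 401 says the credentials were rejected, while 410 says the resource is no longer available on the server. A minimal sketch for pulling the code out of a container log. Both function names are hypothetical, and the pattern assumes the wget-style "awaiting response" line seen in the logs above.

```shell
# Hypothetical helper: read a log on stdin and print the first
# three-digit HTTP status that follows "awaiting response".
first_http_status() {
  sed -n 's/.*awaiting response[^0-9]*\([0-9][0-9][0-9]\).*/\1/p' | head -n 1
}

# Hypothetical helper: map a status code to a short hint based on
# standard HTTP semantics. Only the two codes seen in this thread
# are covered.
explain_http_status() {
  case "$1" in
    401) echo "401: credentials rejected; check the key type and value" ;;
    410) echo "410: resource gone; the asset is no longer served" ;;
    *)   echo "$1: not covered by this sketch" ;;
  esac
}

# Possible usage (the container name is a placeholder):
#   explain_http_status "$(docker logs <container-name> 2>&1 | first_http_status)"
```

If the code is 410, regenerating the API key is unlikely to change the outcome, since the server is reporting that the asset itself is gone.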

Hi @shawn56, sorry you’re running into a problem here. From the last few lines of the log you shared, it looks like you’re not able to authenticate to download the container (see the 410 error). Are you able to check that you are logged into NGC on the system?

I am setting the NGC_API_KEY environment variable to an API key that shows as valid. Beyond that, the workings inside the container are a black box to me. It was working last week with the same image and the same API key, then stopped working sometime after Jan. 16th.

Thanks for the info - let me reach out to the right team. Thanks for your patience!

In the meantime, are you able to request a new API key and try using that instead? Would be interested to know if that solves the issue.

I tried that before (it was my first thought too), but I got the same error. I tried creating a new API key just now and giving the key all 4 permissions that the UI lets me pick, but I got the same error. I am running the image in Kubernetes, and I ran kubectl describe pod and verified that the NGC_API_KEY environment variable for the pod is configured with the new value.
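
A minimal sketch for comparing the key the container actually sees against the one that was generated, without pasting the full secret into a terminal or a forum post. This can help when the variable comes from a Kubernetes secret, since kubectl describe pod then shows a reference to the secret rather than the value. `key_summary` is a hypothetical helper, and the pod name in the usage comment is a placeholder.

```shell
# Hypothetical helper: print a non-secret summary of a key,
# namely its first 6 characters and its total length.
key_summary() {
  prefix=$(printf '%s' "$1" | cut -c1-6)
  printf '%s... (%s chars)\n' "$prefix" "${#1}"
}

# Possible usage (pod name is a placeholder):
#   key_summary "$(kubectl exec <pod-name> -- printenv NGC_API_KEY)"
#   key_summary "$NGC_API_KEY"
# Matching summaries suggest, but do not prove, that the same key
# is in use in both places.
```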

Hi @shawn56, the NIM team is recommending that you move to the newer reranker NIM, llama-3.2-nv-rerankqa-1b-v2 (see "llama-3.2-nv-rerankqa-1b-v2 Model by NVIDIA | NVIDIA NIM" and "NVIDIA Retrieval QA Llama 3.2 1B Reranking v2 | NVIDIA NGC"), as the one you are seeing an error on has been deprecated. Thanks, Sophie