Hi,
I am trying to install/start NIM version 1.0.0 with the model nim/meta/llama3-8b-instruct. I have my personal API key set as a bash variable NGC_API_KEY and in ngc config set. I can log in to the NGC Docker registry (nvcr.io) fine. However, when I start the NIM container, it fails with the error below saying no API key is detected. I have tried:
- Passing -e NGC_API_KEY in docker run
- Specifying the full API key value with -e (NGC_API_KEY=value)
- Creating a new personal API key in my NGC account
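For context on the first two variants: my understanding (an assumption on my part, not from the NIM docs) is that -e NGC_API_KEY only forwards the variable if it is still set in the environment that docker run itself sees, while -e NGC_API_KEY=value bakes the value into the command. A minimal sketch of that difference, using env -i and sh -c as stand-ins for an environment reset (such as sudo can perform) and the container process, so no Docker is needed:

```shell
#!/bin/sh
# Sketch: pass-through (-e NAME) vs explicit (-e NAME=value) behavior.
# `env -i` launches a child with an emptied environment, the way `sudo`
# can drop the caller's exports before docker run ever sees them.
export NGC_API_KEY="dummy-key"

# Pass-through form: the child only sees the key if the launching
# environment still has it; with the environment emptied, it is gone.
in_child=$(env -i sh -c 'echo "${NGC_API_KEY:-<empty>}"')
echo "pass-through under emptied env: $in_child"

# Explicit form: the value travels with the command, so it survives.
explicit=$(env -i NGC_API_KEY="$NGC_API_KEY" sh -c 'echo "$NGC_API_KEY"')
echo "explicit value: $explicit"
```

In my transcript below, docker run is invoked via sudo while the key was exported in my user shell, which is why I tried the explicit form as well.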
[cisco@ai-alma-02:+1][1]$echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
WARNING! Your password will be stored unencrypted in /home/cisco/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credential-stores
Login Succeeded
[cisco@ai-alma-02:+1][0]$sudo docker run -it --rm \
>   --runtime=nvidia \
>   --gpus all \
>   --shm-size=16g \
>   -e NGC_API_KEY \
>   -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
>   -u $(id -u) \
>   -p 8000:8000 \
>   --name=llama3-8b-instruct \
>   $IMG_NAME
===========================================
== NVIDIA Inference Microservice LLM NIM ==
===========================================

NVIDIA Inference Microservice LLM NIM Version 1.0.0
Model: nim/meta/llama3-8b-instruct
Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This NIM container is governed by the NVIDIA AI Product Agreement here:
NVIDIA AI Enterprise Software License Agreement | NVIDIA.
A copy of this license can be found under /opt/nim/LICENSE.
The use of this model is governed by the AI Foundation Models Community License
here: https://docs.nvidia.com/ai-foundation-models-community-license.pdf.
ADDITIONAL INFORMATION: Meta Llama 3 Community License, Built with Meta Llama 3.
A copy of the Llama 3 license can be found under /opt/nim/MODEL_LICENSE.
2024-07-01 02:45:54,473 [INFO] PyTorch version 2.2.2 available.
2024-07-01 02:45:54,980 [WARNING] [TRT-LLM] [W] Logger level already set from environment. Discard new verbosity: error
2024-07-01 02:45:54,980 [INFO] [TRT-LLM] [I] Starting TensorRT-LLM init.
2024-07-01 02:45:55,057 [INFO] [TRT-LLM] [I] TensorRT-LLM inited.
[TensorRT-LLM] TensorRT-LLM version: 0.10.1.dev2024053000
INFO 07-01 02:45:55.876 api_server.py:489] NIM LLM API version 1.0.0
INFO 07-01 02:45:55.878 ngc_profile.py:217] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.
INFO 07-01 02:45:55.878 ngc_profile.py:219] Detected 1 compatible profile(s).
INFO 07-01 02:45:55.878 ngc_injector.py:106] Valid profile: 8835c31752fbc67ef658b20a9f78e056914fdef0660206d82f252d62fd96064d (vllm-fp16-tp1) on GPUs [0]
INFO 07-01 02:45:55.878 ngc_injector.py:141] Selected profile: 8835c31752fbc67ef658b20a9f78e056914fdef0660206d82f252d62fd96064d (vllm-fp16-tp1)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/entrypoints/openai/api_server.py", line 492, in <module>
    engine_args, extracted_name = inject_ngc_hub(engine_args)
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/hub/ngc_injector.py", line 143, in inject_ngc_hub
    repo = optimal_config.workspace()
Exception: Error {
    context: "initializing ngc repo from repo_id: ngc://nim/meta/llama3-8b-instruct:hf",
    source: CommonError(
        Error {
            context: "fetching file_map",
            source: CommonError(
                Error {
                    context: "get bearer token",
                    source: CommonError(
                        "Authentication required; however no API key is detected.\nPlease set the env variable NGC_API_KEY with the API key acquired from NGC. See: https://org.ngc.nvidia.com/setup/api-key.",
                    ),
                },
            ),
        },
    ),
}