Running NIM llama-3_1-8b-instruct fails in On-Prem deployment

I am trying to deploy the llama-3_1-8b-instruct NIM container for on-prem inferencing.

I am getting the error below. The system is behind a firewall, but I was able to use huggingface-cli to download models since proxy settings are in place. I was also able to pull the NIM image, but when it runs it tries to download additional files and fails. Need help.

INFO 2024-10-09 08:42:35.718 ngc_injector.py:206] Selected profile: 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5 (vllm-bf16-tp1)
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: feat_lora: false
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: llm_engine: vllm
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: precision: bf16
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: tp: 1
INFO 2024-10-09 08:42:35.719 ngc_injector.py:245] Preparing model workspace. This step might download additional files to run the model.
[10-09 08:42:49.652 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:117] One or more errors fetching files:
[10-09 08:42:49.652 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url
(error repeats)
[10-09 08:42:49.652 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/nim/llm/vllm_nvext/entrypoints/openai/api_server.py", line 654, in <module>
    inference_env = prepare_environment()
  File "/opt/nim/llm/vllm_nvext/entrypoints/args.py", line 155, in prepare_environment
    engine_args, extracted_name = inject_ngc_hub(engine_args)
  File "/opt/nim/llm/vllm_nvext/hub/ngc_injector.py", line 247, in inject_ngc_hub
    cached = repo.get_all()
Exception: error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)

Hi @manjunath.janardhan1 – if your system is behind a firewall you will likely need to follow the steps in the documentation for Serving Models from Local Assets.
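One thing worth ruling out first (an assumption on my part, not confirmed from the logs): `docker run` does not inherit the host shell's proxy settings, so even if huggingface-cli works on the host, the container may have no route to api.ngc.nvidia.com unless the proxy variables are passed in explicitly. A rough sketch, with a placeholder proxy URL:

```shell
# Placeholder proxy URL -- substitute your real corporate proxy.
PROXY_URL="http://proxy.example.com:3128"

# Pass the proxy into the NIM container explicitly (commented out because it
# needs the NIM image and an NGC_API_KEY; image tag is an assumption):
# docker run --rm --gpus all \
#   -e NGC_API_KEY \
#   -e https_proxy="$PROXY_URL" -e HTTPS_PROXY="$PROXY_URL" \
#   -e no_proxy=localhost,127.0.0.1 \
#   nvcr.io/nim/meta/llama-3.1-8b-instruct:latest

# Sanity check that the variables export as expected on the host:
export https_proxy="$PROXY_URL" HTTPS_PROXY="$PROXY_URL"
echo "https_proxy=$https_proxy"
```

Inside a running container, `env | grep -i proxy` should show the same values; if it prints nothing, the download failures above are consistent with the container having no proxy configured.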

Thanks, I tried that but hit the same issue when running it inside the container. Is there a way to download the cache on another machine that is not behind the firewall and copy it over?

nim@ad54971136af:/$ download-to-cache --profile 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5
INFO 2024-10-09 16:01:27.408 pre_download.py:80] Fetching contents for profile 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5
INFO 2024-10-09 16:01:27.409 pre_download.py:86] {
"feat_lora": "false",
"llm_engine": "vllm",
"precision": "bf16",
"tp": "1"
}
[10-09 16:03:23.453 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:117] One or more errors fetching files:
[10-09 16:03:23.453 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
[10-09 16:03:23.453 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
(error repeats)

Traceback (most recent call last):
  File "/opt/nim/llm/.venv/bin/download-to-cache", line 6, in <module>
    sys.exit(download_to_cache())
  File "/opt/nim/llm/vllm_nvext/hub/pre_download.py", line 90, in download_to_cache
    cached_files = repo.get_all()
Exception: error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
nim@ad54971136af:/$
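Copying the cache from another machine should be workable in principle, since the NIM cache is an ordinary directory mounted into the container. A sketch of one possible approach (the cache path, mount point, and image tag below are assumptions based on the default NIM layout, not confirmed for this setup):

```shell
# Sketch: populate the NIM model cache on an internet-connected machine,
# archive it, and carry the archive to the firewalled host.

NIM_CACHE="$HOME/.cache/nim"   # assumed host cache dir, mounted at /opt/nim/.cache
mkdir -p "$NIM_CACHE"

# 1) On the connected machine, fill the cache (commented out; needs the NIM
#    image and an NGC_API_KEY). The profile ID is the one from the logs above.
# docker run --rm --gpus all \
#   -e NGC_API_KEY \
#   -v "$NIM_CACHE:/opt/nim/.cache" \
#   nvcr.io/nim/meta/llama-3.1-8b-instruct:latest \
#   download-to-cache --profile 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5

# 2) Archive the populated cache directory:
tar -czf nim-cache.tar.gz -C "$NIM_CACHE" .

# 3) Copy nim-cache.tar.gz to the firewalled host, unpack it into the same
#    cache directory there, and start the container with that directory mounted:
# tar -xzf nim-cache.tar.gz -C "$HOME/.cache/nim"
```

With the cache pre-populated and mounted, the container should find the model files locally instead of reaching out to api.ngc.nvidia.com; the air-gapped deployment section of the NIM documentation covers the supported variant of this workflow.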