Trying to deploy the llama-3_1-8b-instruct NIM container for on-prem inferencing and getting the error below. The machine is behind a firewall, but proxy settings are in place; I was able to use huggingface-cli to download models, and I was able to pull the NIM image. However, when the container runs it tries to download some additional files and fails. Need help.
INFO 2024-10-09 08:42:35.718 ngc_injector.py:206] Selected profile: 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5 (vllm-bf16-tp1)
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: feat_lora: false
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: llm_engine: vllm
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: precision: bf16
INFO 2024-10-09 08:42:35.719 ngc_injector.py:214] Profile metadata: tp: 1
INFO 2024-10-09 08:42:35.719 ngc_injector.py:245] Preparing model workspace. This step might download additional files to run the model.
[10-09 08:42:49.652 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:117] One or more errors fetching files:
[10-09 08:42:49.652 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
(the line above repeats multiple times)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/nim/llm/vllm_nvext/entrypoints/openai/api_server.py", line 654, in <module>
    inference_env = prepare_environment()
  File "/opt/nim/llm/vllm_nvext/entrypoints/args.py", line 155, in prepare_environment
    engine_args, extracted_name = inject_ngc_hub(engine_args)
  File "/opt/nim/llm/vllm_nvext/hub/ngc_injector.py", line 247, in inject_ngc_hub
    cached = repo.get_all()
Exception: error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
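Since the downloader inside the container only sees a proxy that is explicitly forwarded, one thing to check is whether the proxy environment variables were passed to `docker run` with `-e`. A minimal sketch, assuming a placeholder proxy URL (`proxy.internal:3128`) and image tag, neither of which comes from this thread:

```shell
# Sketch only: forward the host's proxy settings into the NIM container so
# the in-container downloader can reach api.ngc.nvidia.com.
# "proxy.internal:3128" is a placeholder for your real proxy.
PROXY="http://proxy.internal:3128"

# Dry run: the full docker command is echoed; drop the leading echo to run it.
echo docker run --rm --gpus all \
  -e NGC_API_KEY \
  -e HTTP_PROXY="$PROXY" -e HTTPS_PROXY="$PROXY" \
  -e http_proxy="$PROXY" -e https_proxy="$PROXY" \
  -e NO_PROXY="localhost,127.0.0.1" \
  nvcr.io/nim/meta/llama-3_1-8b-instruct:latest
```

Both upper- and lowercase variants are set because different tools inside the container may read either form.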
Thanks, I tried that, but I still get the same issue when running it inside the container. Is there a way to download the cache on another machine that is not behind the firewall and copy it over?
nim@ad54971136af:/$ download-to-cache --profile 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5
INFO 2024-10-09 16:01:27.408 pre_download.py:80] Fetching contents for profile 3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5
INFO 2024-10-09 16:01:27.409 pre_download.py:86] {
  "feat_lora": "false",
  "llm_engine": "vllm",
  "precision": "bf16",
  "tp": "1"
}
[10-09 16:03:23.453 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:117] One or more errors fetching files:
[10-09 16:03:23.453 ERROR nim_sdk::hub::repo rust/nim-sdk/src/hub/repo.rs:119] error sending request for url (https://api.ngc.nvidia.com/v2/org/nim/team/meta/models/llama-3_1-8b-instruct/hf-8c22764-nim1.2/files)
(the line above repeats multiple times)
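On the "download on another machine and copy it" idea: a rough sketch, assuming the cache directory is portable between hosts when the image version and profile match. The host paths (`/tmp/nim-cache`, the path on the firewalled host) and the image tag are placeholders; the profile hash is the one from the logs above.

```shell
# Step 1: on a machine WITH internet access, populate the cache.
# CACHE is a placeholder host path; the container's cache lives under /opt/nim/.cache.
CACHE="/tmp/nim-cache"
mkdir -p "$CACHE"
PROFILE="3bb4e8fe78e5037b05dd618cebb1053347325ad6a1e709e0eb18bb8558362ac5"

# Dry run (commands are echoed); remove the leading echos to execute.
echo docker run --rm --gpus all -e NGC_API_KEY \
  -v "$CACHE:/opt/nim/.cache" \
  nvcr.io/nim/meta/llama-3_1-8b-instruct:latest \
  download-to-cache --profile "$PROFILE"

# Step 2: archive the cache and move the tarball to the firewalled machine.
echo tar czf nim-cache.tgz -C "$CACHE" .

# Step 3: on the firewalled machine, extract the tarball and mount it as the
# cache when starting the server; with the files already present, the
# container should not need to reach out to api.ngc.nvidia.com.
echo docker run --rm --gpus all -e NGC_API_KEY \
  -v "/path/on/firewalled/host/nim-cache:/opt/nim/.cache" \
  nvcr.io/nim/meta/llama-3_1-8b-instruct:latest
```

Worth double-checking file ownership after extraction, since the container runs as a non-root user and needs read access to the mounted cache.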
I am having similar issues. When I run the command "curl --proxy https://api.ngc.nvidia.com" I get a JSON response "statusCode":"UNAUTHORIZED", which tells me that my machine can reach the API endpoint and my cert is valid. However, when I run download-to-cache --all I get a cert error (InvalidCertificate(UnknownIssuer)). Is this a cert issue on my end?
Yes, my API key is valid. I was only posting that to show that connectivity works between my container and https://api.ngc.nvidia.com via the curl command, but it does not work when running "download-to-cache" or "start_server.sh".
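A guess worth ruling out: curl succeeding while download-to-cache fails with UnknownIssuer often points to a TLS-intercepting proxy whose CA is in the host's system trust store (so curl trusts it) but is not visible to the downloader inside the container. A sketch of mounting the corporate CA bundle into the container and pointing common CA environment variables at it; the bundle path is a placeholder, and whether the NIM downloader honors these variables is an assumption to verify:

```shell
# Sketch: make a corporate/proxy CA visible inside the container.
# /etc/ssl/certs/corp-ca.pem is a placeholder for your CA bundle path.
CA_BUNDLE="/etc/ssl/certs/corp-ca.pem"

# Dry run (echoed); drop the leading echo to execute.
# SSL_CERT_FILE is read by several TLS stacks; REQUESTS_CA_BUNDLE and
# CURL_CA_BUNDLE cover Python requests and curl inside the container.
echo docker run --rm --gpus all -e NGC_API_KEY \
  -v "$CA_BUNDLE:/etc/ssl/certs/corp-ca.pem:ro" \
  -e SSL_CERT_FILE=/etc/ssl/certs/corp-ca.pem \
  -e REQUESTS_CA_BUNDLE=/etc/ssl/certs/corp-ca.pem \
  -e CURL_CA_BUNDLE=/etc/ssl/certs/corp-ca.pem \
  nvcr.io/nim/meta/llama-3_1-8b-instruct:latest
```

If the component doing the download ignores these variables, the remaining options are having the proxy exempt api.ngc.nvidia.com from TLS interception, or the offline cache-copy approach discussed above.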