CUDA fails to start: local NIM container run failed

I am trying to run this model locally on Ubuntu 24.04.1 LTS with:
NVIDIA GeForce RTX 3060 Ti
NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2
Starting with
sudo docker run nvcr.io/nim/meta/llama-3.1-405b-instruct
Leads to:

2024-09-19 10:42:55,094 [INFO] PyTorch version 2.3.1 available.
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/nim/llm/vllm_nvext/entrypoints/launch.py", line 99, in <module>
main()
File "/opt/nim/llm/vllm_nvext/entrypoints/launch.py", line 42, in main
inference_env = prepare_environment()
File "/opt/nim/llm/vllm_nvext/entrypoints/args.py", line 154, in prepare_environment
engine_args, extracted_name = inject_ngc_hub(engine_args)
File "/opt/nim/llm/vllm_nvext/hub/ngc_injector.py", line 190, in inject_ngc_hub
system = get_hardware_spec()
File "/opt/nim/llm/vllm_nvext/hub/hardware_inspect.py", line 285, in get_hardware_spec
gpus = GPUInspect()
File "/opt/nim/llm/vllm_nvext/hub/hardware_inspect.py", line 93, in __init__
GPUInspect._safe_exec(cuda.cuInit(0))
File "cuda/cuda.pyx", line 15966, in cuda.cuda.cuInit
File "cuda/ccuda.pyx", line 17, in cuda.ccuda.cuInit
File "cuda/_cuda/ccuda.pyx", line 2684, in cuda._cuda.ccuda._cuInit
File "cuda/_cuda/ccuda.pyx", line 490, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so.1

Then I googled for a while, found some recommendations, and tried another run.

Starting with
sudo docker run --rm --runtime=nvidia --gpus all nvcr.io/nim/meta/llama-3.1-405b-instruct
Leads to:

2024-09-19 10:34:08,645 [INFO] PyTorch version 2.3.1 available.
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/nim/llm/vllm_nvext/entrypoints/launch.py", line 99, in <module>
main()
File "/opt/nim/llm/vllm_nvext/entrypoints/launch.py", line 42, in main
inference_env = prepare_environment()
File "/opt/nim/llm/vllm_nvext/entrypoints/args.py", line 154, in prepare_environment
engine_args, extracted_name = inject_ngc_hub(engine_args)
File "/opt/nim/llm/vllm_nvext/hub/ngc_injector.py", line 190, in inject_ngc_hub
system = get_hardware_spec()
File "/opt/nim/llm/vllm_nvext/hub/hardware_inspect.py", line 285, in get_hardware_spec
gpus = GPUInspect()
File "/opt/nim/llm/vllm_nvext/hub/hardware_inspect.py", line 93, in __init__
GPUInspect._safe_exec(cuda.cuInit(0))
File "/opt/nim/llm/vllm_nvext/hub/hardware_inspect.py", line 101, in _safe_exec
raise RuntimeError(f"Unexpected error: {status.name}")
RuntimeError: Unexpected error: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE

Now I am stuck. Please advise how to fix this?

Update the GPU driver on the base machine to the latest available for your GPU. And it’s entirely possible that the llama-3.1-405b-instruct container won’t run on your GPU, although that isn’t the error you have run into yet.
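
For example, something like the following (this assumes the ubuntu-drivers-common package is installed, and the CUDA image tag is only an illustration — pick one matching your driver):

# Check which driver and CUDA version the host currently exposes
nvidia-smi

# List the driver packages Ubuntu recommends for this GPU
sudo ubuntu-drivers devices

# Install the newest recommended driver, then reboot
sudo ubuntu-drivers install
sudo reboot

# After rebooting, sanity-check that Docker can see the GPU at all
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi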

Thank you, Robert.
Just for your Knowledge Base:
The 550 drivers do not work properly on Ubuntu 24.04 with the RTX 3060 Ti.
After downgrading to Ubuntu 22.04, the 550 drivers with CUDA 12.4 became available.
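
For anyone reading later, installing the 550 driver on Ubuntu 22.04 was roughly this (the package name may vary with your repository setup):

sudo apt install nvidia-driver-550
sudo reboot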
I am using another, smaller model, since as you suggested the current one cannot run on my hardware.
The container ran successfully and downloaded additional data.
After that I got this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU
The total free memory is 6+ GB, and I am confused why just 224 MiB cannot be allocated.
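For reference, I double-checked the per-GPU memory figures with an nvidia-smi query:

nvidia-smi --query-gpu=memory.used,memory.free,memory.total --format=csv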
How can I fix this?