I'm running into the following error when launching the Llama 3.3 70B Instruct NIM.
The support matrix does not specify which NVIDIA GPUs the NIM is compatible with.
NIM Image: nvcr.io/nim/meta/llama-3.3-70b-instruct:1.5.2
Profile: tensorrt_llm-trtllm_buildable-bf16-tp4-pp1
Hardware:
Tried both A100-80GB x4 and H100 x4 on GCP
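
For context, the container is launched with the standard NIM docker invocation, along these lines (a sketch; the cache path and port are placeholders, and NGC_API_KEY is exported on the host):

docker run -it --rm --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e NIM_MODEL_PROFILE=tensorrt_llm-trtllm_buildable-bf16-tp4-pp1 \
  -v ~/.cache/nim:/opt/nim/.cache \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.3-70b-instruct:1.5.2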
Error stack trace:
Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/launch.py", line 99, in <module>
    main()
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/launch.py", line 42, in main
    inference_env = prepare_environment()
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/args.py", line 204, in prepare_environment
    engine_args, extracted_name = inject_ngc_hub(engine_args)
  File "/opt/nim/llm/nim_llm_sdk/hub/ngc_injector.py", line 245, in inject_ngc_hub
    system = get_hardware_spec()
  File "/opt/nim/llm/nim_llm_sdk/hub/hardware_inspect.py", line 313, in get_hardware_spec
    gpus = GPUInspect()
  File "/opt/nim/llm/nim_llm_sdk/hub/hardware_inspect.py", line 66, in __init__
    GPUInspect._safe_exec(cuda.cuInit(0))
  File "cuda/cuda.pyx", line 15991, in cuda.cuda.cuInit
  File "cuda/ccuda.pyx", line 17, in cuda.ccuda.cuInit
  File "cuda/_cuda/ccuda.pyx", line 2684, in cuda._cuda.ccuda._cuInit
  File "cuda/_cuda/ccuda.pyx", line 490, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so.1
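
Note that the dlopen failure happens during hardware inspection, before any model loading, so the container apparently cannot see the host's CUDA driver library (libcuda.so.1) at all; and since the identical error shows up on both the A100 and H100 nodes, it does not look GPU-specific. For anyone hitting the same thing, these are the sanity checks I would run (a sketch; CONTAINER stands for the running container's name):

# Does the container see the GPUs and driver at all?
docker exec CONTAINER nvidia-smi

# Is libcuda.so.1 on the loader path inside the container?
docker exec CONTAINER ldconfig -p | grep libcuda

If either check fails, the NVIDIA Container Toolkit is not injecting the driver, e.g. the container was started without --gpus all or the toolkit is not configured as Docker's runtime.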