RuntimeError: Failed to dlopen libcuda.so.1 || Running Llama 3.3 70B

I'm running into the following error when running the Llama 3.3 70B NIM.
The support matrix does not specify which NVIDIA GPUs the NIM is compatible with.

NIM Image: nvcr.io/nim/meta/llama-3.3-70b-instruct:1.5.2
Profile: tensorrt_llm-trtllm_buildable-bf16-tp4-pp1

Hardware:
Tried both 4x A100-80GB and 4x H100 on GCP.

Error stack trace:
Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/launch.py", line 99, in <module>
    main()
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/launch.py", line 42, in main
    inference_env = prepare_environment()
  File "/opt/nim/llm/nim_llm_sdk/entrypoints/args.py", line 204, in prepare_environment
    engine_args, extracted_name = inject_ngc_hub(engine_args)
  File "/opt/nim/llm/nim_llm_sdk/hub/ngc_injector.py", line 245, in inject_ngc_hub
    system = get_hardware_spec()
  File "/opt/nim/llm/nim_llm_sdk/hub/hardware_inspect.py", line 313, in get_hardware_spec
    gpus = GPUInspect()
  File "/opt/nim/llm/nim_llm_sdk/hub/hardware_inspect.py", line 66, in __init__
    GPUInspect._safe_exec(cuda.cuInit(0))
  File "cuda/cuda.pyx", line 15991, in cuda.cuda.cuInit
  File "cuda/ccuda.pyx", line 17, in cuda.ccuda.cuInit
  File "cuda/_cuda/ccuda.pyx", line 2684, in cuda._cuda.ccuda._cuInit
  File "cuda/_cuda/ccuda.pyx", line 490, in cuda._cuda.ccuda.cuPythonInit
RuntimeError: Failed to dlopen libcuda.so.1

Hi @mark355, the two errors, RuntimeError: Failed to dlopen libcuda.so.1 and Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory'), both say that the NVIDIA driver library libcuda.so.1 could not be loaded inside the container. That suggests something is missing in the GPU driver setup.
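If it helps narrow things down, here is a minimal sketch of a check you could run with Python inside the container. It only tries to load the same library named in the traceback. The helper name check_shared_library and its loader parameter are my own for illustration and are not part of NIM or any NVIDIA tool.

```python
import ctypes


def check_shared_library(name="libcuda.so.1", loader=ctypes.CDLL):
    """Try to load a shared library by name and report the outcome.

    Hypothetical helper for illustration only. `loader` defaults to
    ctypes.CDLL, which uses dlopen on Linux; it is a parameter so the
    check can be exercised without a GPU.
    Returns a tuple (ok, detail).
    """
    try:
        loader(name)
    except OSError as exc:
        # ctypes raises OSError when the dynamic loader cannot open the library
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    ok, detail = check_shared_library()
    print(f"libcuda.so.1 loadable: {ok} ({detail})")
```

If this reports that the library cannot be opened, the problem is below the NIM itself (driver not installed on the host, or the driver libraries not exposed to the container) rather than with the model profile.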

Which service on GCP are you using? The NVIDIA/nim-deploy repo on GitHub gives examples and instructions for CSP installations. It is a collection of YAML files, Helm charts, Operator code, and guides that act as an example reference implementation for NVIDIA NIM deployment. Please let us know if you need more help!

Thanks,

Sophie