Help fixing an installation error: which NVIDIA driver, CUDA, cuDNN, PyTorch, and TensorRT versions are suitable for Ubuntu 20.04 x86_64?

TensorRT Version: 8.6.1.6-1+cuda11.8
NVIDIA Driver Version: 470.256.02
CUDA Version: 11.8
cuDNN Version: 8.6.0.163-1+cuda11.8
Operating System + Version: Ubuntu 20.04 x86_64
Python Version (if applicable): 3.8.19
PyTorch Version (if applicable): 2.4.0+cu118

Those are the versions installed on my machine (GPU: NVIDIA Corporation TU104GL [Tesla T4]). I also checked torch.cuda.is_available(), which returns True.
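For reference, a minimal sketch of how the versions above could be collected from Python and how CUDA initialization could be forced explicitly. The function names collect_versions and cuda_status are hypothetical helpers, not part of any library. The sketch assumes that torch.cuda.is_available() can return True without the lazy initialization seen in the traceback (_lazy_init) having run, so it calls torch.cuda.init() to surface the same error early.

```python
import importlib
import platform


def collect_versions(modules=("torch", "tensorrt")):
    """Return a dict of version strings for Python and the given modules."""
    report = {"python": platform.python_version()}
    for name in modules:
        try:
            mod = importlib.import_module(name)
            report[name] = getattr(mod, "__version__", "unknown")
        except Exception as exc:  # missing module or broken shared library
            report[name] = f"import failed: {exc}"
    return report


def cuda_status():
    """Report what PyTorch sees, forcing CUDA initialization if possible."""
    try:
        import torch
    except ImportError:
        return {"torch": "not installed"}

    status = {
        "built_for_cuda": torch.version.cuda,      # e.g. "11.8", None on CPU builds
        "cudnn": torch.backends.cudnn.version(),   # None if cuDNN is unavailable
        "is_available": torch.cuda.is_available(),
    }
    if status["is_available"]:
        try:
            # Force the lazy initialization that failed in the traceback.
            torch.cuda.init()
            status["device"] = torch.cuda.get_device_name(0)
        except RuntimeError as exc:
            status["init_error"] = str(exc)
    return status
```

Printing collect_versions() and cuda_status() in the same process (and the same environment) as the failing application would show whether the initialization error already appears at this point or only later in the application.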

However, I get the following errors when running my Python application:

  • File "/lib/python3.8/site-packages/torch/cuda/__init__.py", line 314, in _lazy_init
    torch._C._cuda_init()
    RuntimeError: CUDA error: initialization error
    Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  • [error][cuda_tools.cpp:25]:CUDA Runtime error cudaSetDevice(device_id) # initialization error, code = cudaErrorInitializationError [ 3 ] in file /face_detect_module/cpp/tensorRT/infer/trt_infer.cpp:444

  • [error][cuda_tools.cpp:25]:CUDA Runtime error cudaStreamCreate(&stream_) # initialization error, code = cudaErrorInitializationError [ 3 ] in file /face_detect_module/cpp/tensorRT/infer/trt_infer.cpp:66

I wonder whether I installed incompatible versions of CUDA, TensorRT, or PyTorch. Could you suggest a solution? Thanks.

Hi,
Try enabling CUDA core dumps on a rerun, then open the dump in cuda-gdb to see which kernel actually fails. Set these environment variables before launching the application:
CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1 CUDA_ENABLE_LIGHTWEIGHT_COREDUMP=1 CUDA_COREDUMP_SHOW_PROGRESS=1
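One way to apply the variables above without exporting them in the shell is a small Python launcher. This is only a sketch: run_with_coredump is a hypothetical helper, and the command passed to it (for example a script named app.py) is a placeholder for the real application.

```python
import os
import subprocess
import sys

# Environment variables named in the suggestion above.
COREDUMP_ENV = {
    "CUDA_ENABLE_COREDUMP_ON_EXCEPTION": "1",
    "CUDA_ENABLE_LIGHTWEIGHT_COREDUMP": "1",
    "CUDA_COREDUMP_SHOW_PROGRESS": "1",
}


def run_with_coredump(cmd):
    """Run cmd (a list of arguments) with CUDA core dumps enabled."""
    env = dict(os.environ)    # copy, so the parent environment is untouched
    env.update(COREDUMP_ENV)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

For example, run_with_coredump([sys.executable, "app.py"]) would rerun a (hypothetical) app.py with the variables set; the returned object carries the exit code in returncode and the captured output in stdout and stderr.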

For the best support, though, please reach out to the CUDA forum.

Thanks