I'm facing a CUDA compatibility issue with my new RTX 5080 in deep learning

Description

"Hello there! I’m facing a CUDA compatibility issue with my new RTX 5080.

My environment can successfully recognize the GPU (e.g., torch.cuda.is_available() returns True), but any attempt to run a training operation fails with the following error:

Exception has occurred: RuntimeError

CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

File "C:\Users\Hyeyeoncho\Documents\My Codes\sac\SAC.py", line 249, in <module>
    main()
File "C:\Users\Hyeyeoncho\Documents\My Codes\sac\SAC.py", line 227, in main
    action = agent.select_action(state)
File "C:\Users\Hyeyeoncho\Documents\My Codes\sac\SAC.py", line 149, in select_action
    action, _ = self.actor.sample(state)
File "C:\Users\Hyeyeoncho\Documents\My Codes\sac\SAC.py", line 82, in sample
    mean, log_std = self.forward(state)
File "C:\Users\Hyeyeoncho\Documents\My Codes\sac\SAC.py", line 72, in forward
    x = F.relu(self.fc1(x))
RuntimeError: CUDA error: no kernel image is available for execution on the device

It seems the pre-compiled kernels in my PyTorch build do not include support for my GPU’s architecture (sm_120). What is the recommended way to fix this? Any help would be appreciated."

Environment

GPU Type: NVIDIA GeForce RTX 5080
Nvidia Driver Version: 581.29
CUDA Version: 12.4
CUDNN Version: 90100 (i.e., cuDNN 9.1.0, as reported by torch.backends.cudnn.version())
Operating System + Version: Windows 11 (Build 26100)
Python Version (if applicable): 3.11.13
PyTorch Version (if applicable): 2.6.0.dev20241112
TensorFlow Version (if applicable): N/A
TensorRT Version: Not Installed

Relevant Files

*** Please note that this reply is generated by LLM automatically ***

The error “CUDA error: no kernel image is available for execution on the device” means that none of the precompiled CUDA kernels in your PyTorch build target your GPU’s architecture. The RTX 5080 is a Blackwell-generation GPU with compute capability 12.0 (sm_120), which the CUDA toolkit only began to target with CUDA 12.8. Your environment reports CUDA 12.4, so the PyTorch wheel you installed was compiled without sm_120 kernels: the GPU is detected (hence torch.cuda.is_available() returning True), but every kernel launch fails.
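You can see this mismatch directly by comparing the wheel’s compiled architecture list against the device’s capability. On a real install you would pass torch.cuda.get_arch_list() and torch.cuda.get_device_capability() to a check like the one below; the cu124 architecture list here is illustrative, not read from your machine:

```python
def wheel_supports_gpu(arch_list, capability):
    """Check whether any compiled architecture in arch_list matches the
    GPU's compute capability (major, minor)."""
    target = f"sm_{capability[0]}{capability[1]}"
    return target in arch_list

# Architectures typically baked into a cu124 wheel (illustrative list):
cu124_archs = ["sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86", "sm_90"]

print(wheel_supports_gpu(cu124_archs, (12, 0)))  # RTX 5080 (sm_120) -> False
print(wheel_supports_gpu(cu124_archs, (8, 6)))   # e.g. an RTX 3080 -> True
```

If the first check prints False for your GPU, no reinstall of drivers or cuDNN will help; the wheel itself lacks the kernels.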

To resolve this issue, you have a few options:

  1. Compile PyTorch from source: Building from source lets you compile kernels for exactly the architecture you need by setting TORCH_CUDA_ARCH_LIST (e.g., to 12.0 for the RTX 5080), provided you build against CUDA 12.8 or newer. The PyTorch repository documents the build procedure, though a from-source build on Windows is non-trivial.

  2. Use a pre-built PyTorch wheel that supports your GPU’s architecture: PyTorch wheels built against CUDA 12.8 include sm_120 kernels. At the time of writing these ship from the cu128 index on download.pytorch.org; check the official install selector for the current stable or nightly variant.

  3. Use a newer version of PyTorch and CUDA: Because sm_120 support was only added in CUDA 12.8, downgrading will not help here; the fix is to move to a PyTorch build compiled against CUDA 12.8 or later.

  4. Use TensorRT: For inference-only workloads, TensorRT is an option for optimizing and deploying models; recent TensorRT releases add Blackwell support, though you should check the release notes for the exact minimum version. It does not address training.
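Options 1 and 2 can be sketched as commands. The nightly cu128 index URL below follows PyTorch’s usual wheel-index pattern and may change over time, so verify it against the official install selector before running:

```shell
# Option 2: install a wheel built against CUDA 12.8, which includes
# sm_120 (Blackwell) kernels. A stable cu128 wheel may exist by the
# time you read this; otherwise use the nightly index.
pip uninstall -y torch torchvision torchaudio
pip install --pre torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/cu128

# Option 1: when building from source instead, tell the build which
# architecture to compile kernels for before running the build.
set TORCH_CUDA_ARCH_LIST=12.0   # Windows (cmd); use `export VAR=value` on Linux
```

Reinstalling into a clean virtual environment avoids the uninstalled wheel’s cached files shadowing the new one.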

In terms of the recommended way to fix this issue, installing a pre-built wheel compiled against CUDA 12.8 is the simplest path and is usually sufficient. Compiling from source gives full control over the target architectures but is considerably more work, especially on Windows, so treat it as the fallback.

Here are some additional recommendations:

  • Make sure that your CUDA and cuDNN versions are compatible with your PyTorch version.
  • Check that your GPU driver is up to date.
  • If you’re using a containerization platform like Docker, make sure that the container has the correct CUDA and cuDNN versions installed.
  • If you’re using a Python virtual environment, confirm which torch wheel is actually installed inside it (e.g., with pip show torch), since the wheel bundles its own CUDA runtime and an old wheel in the active environment will reproduce the error.
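The first bullet (CUDA/PyTorch compatibility) can be sketched as a quick version check. The capability-to-toolkit mapping below is illustrative (assumed first-supporting CUDA releases per compute-capability major version), not an official table; consult NVIDIA’s documentation for authoritative values:

```python
# Minimum CUDA toolkit needed to target a given compute-capability major
# version (illustrative values; check NVIDIA's docs for the full table).
MIN_CUDA = {7: (9, 0), 8: (11, 0), 9: (11, 8), 12: (12, 8)}

def cuda_new_enough(cuda_version, cc_major):
    """True if a toolkit version (major, minor) can emit kernels for a
    GPU whose compute-capability major version is cc_major."""
    needed = MIN_CUDA.get(cc_major)
    return needed is not None and cuda_version >= needed

print(cuda_new_enough((12, 4), 12))  # cu124 wheel vs. RTX 5080 -> False
print(cuda_new_enough((12, 8), 12))  # cu128 wheel vs. RTX 5080 -> True
```

This is exactly the failure in your environment: the toolkit version the wheel was built with (12.4) is older than the first release that can target your GPU (12.8).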

Note that TensorRT is an inference-optimization and deployment library, not a training framework. Since your stack trace comes from an SAC training loop, fixing the PyTorch build is the relevant path; TensorRT would only become useful later, for deploying the trained model.

In terms of the environment: you’re on Windows 11 with an RTX 5080 and driver 581.29, which is recent enough for Blackwell. The problem is the CUDA 12.4 toolchain your PyTorch nightly was built against, which predates sm_120 support, so the wheel ships no kernels for your GPU and every launch raises the error above.

Overall, the quickest fix is to install a PyTorch wheel built against CUDA 12.8 or newer; compiling from source with TORCH_CUDA_ARCH_LIST set to 12.0 is the fallback if no suitable wheel exists. After reinstalling, verify that sm_120 appears in torch.cuda.get_arch_list() before rerunning training, and confirm that the active environment (virtual environment or container) is actually using the new wheel.
