PyTorch CUDA Error on Jetson Orin Nano Super: "no kernel image is available for execution on the device"

Hello

I have set up my Jetson Orin Nano Super Developer Kit and verified that PyTorch detects CUDA. However, the first tensor operation on the GPU fails with the following error, and I can’t figure out the cause.

Device Information

  • JetPack Version : 6.2
  • PyTorch Version : 2.6.0+cu126
  • CUDA Version : 12.6
  • cuDNN Version : 90501

Python Code

import torch

print("PyTorch Version:", torch.__version__)
print("CUDA Version:", torch.version.cuda)
print("cuDNN Version:", torch.backends.cudnn.version())
print("CUDA Available:", torch.cuda.is_available()) 

if torch.cuda.is_available(): 
    device = torch.device("cuda")
    print("CUDA is available!")
    print("Device Name:", torch.cuda.get_device_name(0))  
    x = torch.tensor([1.0, 2.0, 3.0]).to(device)
    y = x * 2
    print(y)  
else:
    print("GPU is not available, running on CPU.")

Output

PyTorch Version: 2.6.0+cu126  
CUDA Version: 12.6  
cuDNN Version: 90501  
CUDA Available: True  
CUDA is available!  
Device Name: Orin  
Traceback (most recent call last):  
  File ~/Desktop/Proje/untitled0.py, line 20  
    y = x * 2  
RuntimeError: CUDA error: no kernel image is available for execution on the device  
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.  
For debugging consider passing CUDA_LAUNCH_BLOCKING=1  
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.  

Issue
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

How can I fix this issue? Do I need a custom PyTorch build for Jetson Orin Nano Super ?

Hi,

Could you share which PyTorch package you installed?

Usually, this error occurs when a library has not been compiled for the correct GPU architecture.
For Orin, the GPU architecture (compute capability) is 8.7.
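
To see whether an installed wheel matches the GPU, something like the sketch below may help. It compares the device's compute capability with the architectures the PyTorch build was compiled for, using `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`. The helper name `has_exact_arch` is made up for this example, and the rule is deliberately strict (an exact `sm_XY` match, ignoring any `compute_XY` PTX entries), so treat a negative result as a hint rather than proof.

```python
def has_exact_arch(arch_list, capability):
    """Return True if arch_list contains an exact sm_XY entry for the device.

    arch_list:  strings such as "sm_87", as returned by torch.cuda.get_arch_list()
    capability: (major, minor) tuple, as returned by torch.cuda.get_device_capability()

    Hypothetical helper; simplified assumption: only an exact match counts.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
        return

    if not torch.cuda.is_available():
        print("CUDA is not available to this PyTorch build.")
        return

    capability = torch.cuda.get_device_capability(0)  # (8, 7) is expected on Orin
    arch_list = torch.cuda.get_arch_list()            # architectures compiled into the wheel

    print("Device capability:", capability)
    print("Compiled architectures:", arch_list)

    if has_exact_arch(arch_list, capability):
        print("This build contains kernels for the device architecture.")
    else:
        print("No exact match: this build was probably not compiled for this GPU.")


if __name__ == "__main__":
    main()
```

If the printed list has no `sm_87` entry, the wheel was most likely built for other GPUs, which would be consistent with the "no kernel image" error.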

Thanks.

Hi,

I can share the package I installed:

 PyTorch - Cuda 12.6 => pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

However, from what I’ve read in other sources, it seems that JetPack 6.2 is not supported by this package. I also read that downgrading the package might be necessary, depending on the JetPack version.

Hi,

We provide prebuilt PyTorch packages for JetPack 6.2.
You can find them at the link below:

https://pypi.jetson-ai-lab.dev/jp6/cu126

Thanks.

Hi,
I would like to mention that I don’t have prior experience with this topic. Could you kindly assist me by specifying what I should download from the link you provided?

  • JetPack Version : 6.2
  • PyTorch Version : 2.6.0+cu126
  • CUDA Version : 12.6
  • cuDNN Version : 90501

Hi,

The "no kernel image" error indicates that the package was not built for the GPU architecture of your device. For the Orin series, the architecture is sm 8.7.
This is a common issue on Jetson because most third-party packages are built for dGPU devices, which have different architectures.

For this reason, we provide some common packages for the Jetson community at the link shared above.
After setting up your device, please download the packages you need and install them.

For example, please try the commands below for PyTorch and torchvision:

$ wget https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/a86/1895294d90440/torch-2.6.0rc1-cp310-cp310-linux_aarch64.whl#sha256=a861895294d90440f2cdbd863d3fd5407fcc346f819665f3a63d90dfcf41a5b0
$ pip install torch-2.6.0rc1-cp310-cp310-linux_aarch64.whl
$ wget https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/5f9/67f920de3953f/torchvision-0.20.0-cp310-cp310-linux_aarch64.whl#sha256=5f967f920de3953f2a39d95154b1feffd5ccc06b4589e51540dc070021a9adb9
$ pip install torchvision-0.20.0-cp310-cp310-linux_aarch64.whl

Thanks.
