When running inference with a YOLOv8 model on the Jetson Xavier NX, the script crashes with the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Thank you in advance for any advice that would help to solve this problem.
Software and runtime versions:
nvidia-l4t-core 35.2.1-20230124153320
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Sun_Oct_23_22:16:07_PDT_2022
Cuda compilation tools, release 11.4, V11.4.315
Build cuda_11.4.r11.4/compiler.31964100_0
The problem was that I had installed torchvision from a pre-built wheel, as recommended in the Ultralytics guide.
Pre-built wheels from providers like PyPI or Anaconda don't include the CUDA support required on the Jetson's aarch64 platform, so the only way is to compile torchvision from source with CUDA enabled.
If anyone faces a similar problem in the future and comes across this thread, here's how to build torchvision from source.
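For context, "no kernel image is available" means the installed binary contains no GPU code for the device's compute capability (the Xavier NX is sm_72). The matching rule can be sketched as follows; the function and names here are illustrative, not a real CUDA API:

```python
# Illustrative sketch of the CUDA fatbinary matching rule (names are mine,
# not a real API). A binary can embed SASS (cubin) for specific architectures
# and PTX that the driver can JIT-compile for newer devices.
def has_kernel_for(device_cc, sass_archs, ptx_archs=()):
    """device_cc and arch entries are (major, minor) tuples, e.g. (7, 2) for sm_72."""
    # SASS is binary-compatible only within the same major architecture,
    # for devices with an equal or higher minor revision.
    if any(major == device_cc[0] and minor <= device_cc[1]
           for major, minor in sass_archs):
        return True
    # PTX can be JIT-compiled for any device at or above its target arch.
    return any(arch <= device_cc for arch in ptx_archs)

# A wheel built only for, say, sm_86 desktop GPUs has no kernel image
# for the Xavier NX, hence the RuntimeError above:
print(has_kernel_for((7, 2), {(8, 6)}))  # False -> the crash
print(has_kernel_for((7, 2), {(7, 2)}))  # True  -> what a source build gives us
```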
git clone https://github.com/pytorch/vision.git
cd vision
git checkout v0.16.1
# Build the CUDA version explicitly
export FORCE_CUDA=1
# Build and install the release version of torchvision
sudo python setup.py install  # or: python setup.py install --user for a user-specific installation
# Build torchvision with all APIs
sudo python setup.py build develop
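Why FORCE_CUDA=1 matters: the setup script normally builds the CUDA extensions only when it detects CUDA at build time, and the flag overrides that detection. A simplified sketch of that gating logic (the real code lives in torchvision's setup.py; this function is illustrative only):

```python
import os

# Simplified, illustrative sketch of how a setup script can gate its CUDA
# extension build; the real logic is in torchvision's setup.py.
def should_build_cuda(cuda_detected, env=None):
    env = os.environ if env is None else env
    # FORCE_CUDA=1 forces the CUDA build even when detection fails,
    # which is what `export FORCE_CUDA=1` relies on.
    if env.get("FORCE_CUDA", "0") == "1":
        return True
    return cuda_detected

print(should_build_cuda(False, {"FORCE_CUDA": "1"}))  # True
print(should_build_cuda(False, {}))                   # False
```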
I can further confirm that PyTorch installed from the wheel torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl is compatible with torchvision 0.16.1.
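That pairing follows the upstream minor-version mapping; a partial map (values taken from the compatibility matrix in the torchvision README; the helper name is mine) can be expressed as:

```python
# Partial torch -> torchvision minor-series map (from the compatibility matrix
# in the torchvision README); the helper name is illustrative.
TORCHVISION_FOR_TORCH = {"2.1": "0.16", "2.0": "0.15", "1.13": "0.14"}

def torchvision_series_for(torch_version):
    """Map a full torch version string (local tags included) to the matching
    torchvision minor series, e.g. '2.1.0a0+41361538.nv23.06' -> '0.16'."""
    base = torch_version.split("+", 1)[0]        # drop the local version tag
    major_minor = ".".join(base.split(".")[:2])  # keep only major.minor
    return TORCHVISION_FOR_TORCH[major_minor]

print(torchvision_series_for("2.1.0a0+41361538.nv23.06"))  # 0.16
```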
I also want to thank @dusty_nv, whose comment helped me understand what the problem was.