CUDA 12.4 on Jetson Orin - CUDA driver/runtime API version mismatch

The JetPack 6 release available for Orin supports CUDA runtime versions only up to 12.2. This is confirmed by nvidia-smi:


+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.2.0                Driver Version: N/A          CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Per the well-explained post here (Different CUDA versions shown by nvcc and NVIDIA-smi - Stack Overflow), with the above CUDA driver, a CUDA runtime version greater than 12.2 is not guaranteed to work. However, NVIDIA’s CUDA download selector does offer Linux → aarch64-jetson → Native → Ubuntu → 22.04 choices for both CUDA 12.4 and 12.6.
CUDA Toolkit 12.4 Downloads | NVIDIA Developer
It’s not clear how these could ever work if the driver only supports up to 12.2, and there isn’t a newer JetPack image available either for download or via an ‘apt’ repository update after installation.
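
Incidentally, the driver-side ceiling can be confirmed independently of nvidia-smi by querying the driver API directly. A minimal sketch in Python via ctypes, assuming libcuda.so.1 is resolvable by the dynamic loader:

import ctypes

# Load the CUDA *driver* library and ask it what it supports.
libcuda = ctypes.CDLL("libcuda.so.1")
version = ctypes.c_int(0)
# cuDriverGetVersion reports the latest CUDA version the installed driver
# supports, encoded as 1000 * major + 10 * minor (e.g. 12020 for CUDA 12.2).
status = libcuda.cuDriverGetVersion(ctypes.byref(version))
assert status == 0, f"cuDriverGetVersion failed with CUresult {status}"
print(f"Driver supports up to CUDA {version.value // 1000}.{(version.value % 1000) // 10}")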

The reason I asked for 12.4 is that torch 2.4.0 has official pre-built wheels for aarch64 only for CUDA 12.4 (and not for CUDA 12.2, 12.1, or any other version). Running it on JetPack 6 with the above CUDA driver expectedly yields the error:
RuntimeError: GET was unable to find an engine to execute this computation
which is indicative of a mismatch between the CUDA version PyTorch was compiled with (12.4) and the one supported by the CUDA driver. Note that I do have the CUDA 12.4 toolkit installed, so the CUDA 12.4 runtime itself is available, but that isn’t expected to work correctly given the CUDA driver version.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:23:12_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
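
For completeness, the same mismatch is visible from inside Python; a quick check along these lines (assuming the CUDA 12.4 torch wheel is the one installed):

import torch

# torch.version.cuda is the CUDA version this wheel was built against
# (12.4 here), while the driver above only supports up to 12.2.
print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())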

So, how are the CUDA 12.4 and 12.6 toolkits for Ubuntu 22.04 on the Jetson Orin Nano expected to function? Thanks.

@uday1 There is a cuda-compat package included in those downloads that enables the upgrade:
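
One way to verify that the forward-compatibility libcuda from cuda-compat is the one a process actually loads is to inspect the process’s memory maps. A sketch, assuming Linux and that cuda-compat places its libraries under a compat/ directory:

import ctypes

# Force the dynamic loader to resolve libcuda, then see which file got mapped.
ctypes.CDLL("libcuda.so.1")
with open("/proc/self/maps") as maps:
    libcuda_paths = {line.split()[-1] for line in maps if "libcuda" in line}
# A path containing ".../compat/" suggests the cuda-compat forward-compatibility
# driver is in use rather than the stock JetPack driver library.
print(libcuda_paths)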

I think those PyTorch wheels are probably for aarch64 SBSA (server) rather than aarch64+iGPU (Jetson). I have PyTorch 2.4 wheels for Jetson and CUDA 12.2 here: https://developer.download.nvidia.cn/compute/redist/jp/v60/pytorch/


Thank you - this works perfectly. I used the package torch-2.4.0a0+3bcc3cddb5.nv24.07.16234504-cp310-cp310-linux_aarch64.whl. The combination here is CUDA 12.2 (toolkit and driver API), torch 2.4.0, and the Jetson Orin Nano.
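
For anyone else landing here, a quick sanity check along these lines confirms the setup (the matrix size is arbitrary):

import torch

# The wheel should report an NVIDIA build of 2.4.0 built against CUDA 12.2
# and see the Orin iGPU.
print(torch.__version__, torch.version.cuda)   # expect a 2.4.0 NV build, 12.2
print(torch.cuda.get_device_name(0))           # should name the Orin
x = torch.randn(1024, 1024, device="cuda")
print((x @ x).sum().item())                    # any finite number means kernels run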

Do you also have a torchvision package aligned with the above (i.e., with a matching CUDA runtime dependency)? Thanks.

Glad that you got it working @uday1! I don’t have a torchvision wheel built specifically against PyTorch 2.4, but I do have a torchvision 0.18 wheel that may work for you:

http://jetson.webredirect.org/jp6/cu122/torchvision/0.18.0a0+6043bc2

Thank you @dusty_nv. Both the torchvision version you link above and the other official stable aarch64 versions (0.17.0, 0.19.0) run into an issue when used with your torch 2.4.0 package:

$ python
Python 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/library.py", line 639, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/library.py", line 139, in _register_fake
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist

I was able to get torchvision working, though, using an official nightly wheel of version 0.20.0, installed as follows:

pip install --force-reinstall torchvision==0.20.0.dev20240703 --no-deps --index-url https://download.pytorch.org/whl/nightly/cu124

This actually works with all the vision models I tried.
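
Since the failure above was in registering torchvision::nms, a smoke test that exercises that operator directly is a reasonable check (the box values are arbitrary):

import torch
from torchvision.ops import nms

# Two overlapping boxes in (x1, y1, x2, y2) format; their IoU is ~0.68 > 0.5,
# so the lower-scoring box should be suppressed, leaving index 0.
boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")
print(nms(boxes, scores, iou_threshold=0.5))   # expect tensor([0], ...)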

Finally, the only non-functional piece I see now is Torch Inductor (the default torch.compile backend). It doesn’t work because there is no OpenAI Triton installation for the Jetson.

  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 2653, in create_backend
    raise RuntimeError(
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Cannot find a working triton installation. More information on installing Triton can be found at https://github.com/openai/triton

There is no official Triton wheel I could find for the Jetson, and a build from source on the Jetson itself is problematic since it often runs out of memory (I’m still trying this with the build restricted to a single job). Do you happen to have a pre-built Triton package? Thanks.
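
In the meantime, one workaround is to pick a torch.compile backend that doesn’t depend on Triton; aot_eager skips Inductor’s Triton code generation, at the cost of most of the speedup. A minimal sketch:

import torch

def f(x):
    return torch.sin(x) + torch.cos(x)

# aot_eager traces and partitions the graph but runs the ops eagerly,
# so it needs no Triton install (and gives up Inductor's codegen gains).
compiled = torch.compile(f, backend="aot_eager")
print(compiled(torch.randn(8, device="cuda")))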

We’ve acknowledged you in a post that reports some performance numbers obtained with this torch wheel on a Jetson Orin Nano: PolyMage Labs on LinkedIn: PolyBlocks can now easily compile out of the box for an edge AI device as…
Thanks again!

There are Triton wheels and containers for JetPack 6 here:

Oh cool, thanks - I like what you are doing! Thanks for supporting edge devices 👍
