CUDA 12.4 on Jetson Orin - CUDA driver/runtime API version mismatch

The JetPack 6 release available for Orin supports CUDA runtime versions only up to 12.2. This is confirmed by nvidia-smi:


+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.2.0                Driver Version: N/A          CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Per the well-explained post here (Different CUDA versions shown by nvcc and NVIDIA-smi - Stack Overflow), with the above CUDA driver, a CUDA runtime version greater than 12.2 is not guaranteed to work. However, NVIDIA’s CUDA download selector does offer Linux → aarch64-jetson → Native → Ubuntu → 22.04 choices for both CUDA 12.4 and 12.6.
CUDA Toolkit 12.4 Downloads | NVIDIA Developer
It’s not clear how these could ever work if the driver only supports up to 12.2, and there isn’t a newer JetPack image available either for download or via an ‘apt’ repository update after installation.
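
Incidentally, the driver-side ceiling can be confirmed independently of nvidia-smi by querying the driver API directly. A minimal sketch in Python via ctypes, assuming libcuda.so.1 is resolvable by the dynamic loader:

import ctypes

# Load the CUDA *driver* library and ask it what it supports.
libcuda = ctypes.CDLL("libcuda.so.1")
version = ctypes.c_int(0)
# cuDriverGetVersion reports the latest CUDA version the installed driver
# supports, encoded as 1000 * major + 10 * minor (e.g. 12020 for CUDA 12.2).
status = libcuda.cuDriverGetVersion(ctypes.byref(version))
assert status == 0, f"cuDriverGetVersion failed with CUresult {status}"
print(f"Driver supports up to CUDA {version.value // 1000}.{(version.value % 1000) // 10}")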

The reason I asked for 12.4 is that torch 2.4.0 has official pre-built wheels for aarch64 only for CUDA 12.4 (and not for CUDA 12.2, 12.1, or any other version). Running it on JetPack 6 with the above CUDA driver expectedly yields the error:
RuntimeError: GET was unable to find an engine to execute this computation
which is indicative of a mismatch between the CUDA version PyTorch was compiled with (12.4) and the one supported by the CUDA driver. Note that I do have the CUDA 12.4 toolkit installed, so the CUDA 12.4 runtime itself is available, but that isn’t expected to work correctly given the CUDA driver version.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:23:12_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
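
For completeness, the same mismatch is visible from inside Python; a quick check along these lines (assuming the CUDA 12.4 torch wheel is the one installed):

import torch

# torch.version.cuda is the CUDA version this wheel was built against
# (12.4 here), while the driver above only supports up to 12.2.
print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())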

So, how are the CUDA 12.4 and 12.6 toolkits for Ubuntu 22.04 on the Jetson Orin Nano expected to function? Thanks.

@uday1 There is a cuda-compat package included in those downloads that enables the upgrade:
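
One way to verify that the forward-compatibility libcuda from cuda-compat is the one a process actually loads is to inspect the process’s memory maps. A sketch, assuming Linux and that cuda-compat places its libraries under a compat/ directory:

import ctypes

# Force the dynamic loader to resolve libcuda, then see which file got mapped.
ctypes.CDLL("libcuda.so.1")
with open("/proc/self/maps") as maps:
    libcuda_paths = {line.split()[-1] for line in maps if "libcuda" in line}
# A path containing ".../compat/" suggests the cuda-compat forward-compatibility
# driver is in use rather than the stock JetPack driver library.
print(libcuda_paths)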

I think those PyTorch wheels are probably for aarch64 SBSA (server) rather than aarch64+iGPU (Jetson). I have PyTorch 2.4 wheels for Jetson and CUDA 12.2 here: https://developer.download.nvidia.cn/compute/redist/jp/v60/pytorch/


Thank you - this works perfectly. I used the package torch-2.4.0a0+3bcc3cddb5.nv24.07.16234504-cp310-cp310-linux_aarch64.whl. The combination here is CUDA 12.2 (toolkit and driver API), torch 2.4.0, and the Jetson Orin Nano.
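
For anyone else landing here, a quick sanity check along these lines confirms the setup (the matrix size is arbitrary):

import torch

# The wheel should report an NVIDIA build of 2.4.0 built against CUDA 12.2
# and see the Orin iGPU.
print(torch.__version__, torch.version.cuda)   # expect a 2.4.0 NV build, 12.2
print(torch.cuda.get_device_name(0))           # should name the Orin
x = torch.randn(1024, 1024, device="cuda")
print((x @ x).sum().item())                    # any finite number means kernels run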

Do you also have a torchvision package aligned with the above (i.e., with a matching CUDA runtime dependency)? Thanks.

Glad that you got it working @uday1! I don’t have a torchvision wheel built specifically against PyTorch 2.4, but I do have a torchvision 0.18 wheel that may work for you:

http://jetson.webredirect.org/jp6/cu122/torchvision/0.18.0a0+6043bc2

Thank you @dusty_nv. Both the torchvision version you link above and the other official stable aarch64 versions (0.17.0, 0.19.0) run into an issue when used with your torch 2.4.0 package:

$ python
Python 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/library.py", line 639, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/library.py", line 139, in _register_fake
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist

I was able to get torchvision working, though, using an official nightly wheel of version 0.20.0, installed as follows:

pip install --force-reinstall torchvision==0.20.0.dev20240703 --no-deps --index-url https://download.pytorch.org/whl/nightly/cu124

This actually works with all the vision models I tried.
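
Since the failure above was in registering torchvision::nms, a smoke test that exercises that operator directly is a reasonable check (the box values are arbitrary):

import torch
from torchvision.ops import nms

# Two overlapping boxes in (x1, y1, x2, y2) format; their IoU is ~0.68 > 0.5,
# so the lower-scoring box should be suppressed, leaving index 0.
boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")
print(nms(boxes, scores, iou_threshold=0.5))   # expect tensor([0], ...)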

Finally, the only non-functional piece I see now is Torch Inductor (the default torch.compile backend). It doesn’t work because there is no OpenAI Triton installation for the Jetson.

  File "/home/uday/.venv/py3.10/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 2653, in create_backend
    raise RuntimeError(
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Cannot find a working triton installation. More information on installing Triton can be found at https://github.com/openai/triton

There is no official Triton wheel I could find for the Jetson, and a build from source on the Jetson itself is problematic since it often runs out of memory (I’m still trying this with the build restricted to a single job). Do you happen to have a pre-built Triton package? Thanks.
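
In the meantime, one workaround is to pick a torch.compile backend that doesn’t depend on Triton; aot_eager skips Inductor’s Triton code generation, at the cost of most of the speedup. A minimal sketch:

import torch

def f(x):
    return torch.sin(x) + torch.cos(x)

# aot_eager traces and partitions the graph but runs the ops eagerly,
# so it needs no Triton install (and gives up Inductor's codegen gains).
compiled = torch.compile(f, backend="aot_eager")
print(compiled(torch.randn(8, device="cuda")))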

We’ve acknowledged you in a post that reports some performance numbers obtained with this torch wheel on a Jetson Orin Nano: PolyMage Labs on LinkedIn: PolyBlocks can now easily compile out of the box for an edge AI device as…
Thanks again!

There are Triton wheels and containers for JetPack 6 here:

Oh cool, thanks - I like what you are doing! Thanks for supporting edge devices 👍
