PyTorch for Jetson

@dusty_nv Ah, thanks a ton. I totally missed that. :)

Failed installing torchvision from source on Orin Nano with PyTorch=2.0 and torchvision=0.15.1

    raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported

Any suggestions on how to fix it?

Thank you,

Hi @akhilgurram.ai, please see this topic:

Thanks for sharing the link. It works.

Are there any prebuilt wheels for Python with fbgemm or qnnpack? I need to use int8 for some experiments.
I am using a Jetson Nano with JP 4.6.3.

I can also try to build it myself if someone can provide a link to any documentation.

Hi @pramodhrachuri, I don’t believe there are - when building the PyTorch wheels for JetPack 4, I disabled QNNPACK because it hit compilation errors at some point in the past (and since QNNPACK is CPU-only, I never really dug into it). If you want to try it, the general procedure that I followed for making the wheels can be found under the Build from Source section at the top of this post.

Hi @dusty_nv
I’m working with a Xavier AGX, with JetPack 5.1.1; R35.3.1
I’ve used the SDK Manager (1.9.2.10899) to install CUDA/cuDNN
I followed your steps to install torch (2.0.0) and torchvision (0.15.1) for that JetPack version.
All of the above installed with no problems.

I’m running into issues on two fronts, where I am trying to install caffe from source as well as run some items with torch. Unfortunately I don’t have the caffe output handy just yet - my priority is to fix the torch environment. But with the torch environment, the general error code I’m getting is:
<class ‘RuntimeError’> cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

I installed CUDA 12.1 via:

I installed cuDNN 8.9.1.23 via:
sudo dpkg -i cudnn-local-repo-ubuntu2004-8.9.1.23_1.0-1_arm64.deb
sudo cp /var/cudnn-local-repo-ubuntu2004-8.9.1.23/cudnn-local-828249D0-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.9.1.23-1+cuda12.1
sudo apt-get install libcudnn8-dev=8.9.1.23-1+cuda12.1
sudo apt-get install libcudnn8-samples=8.9.1.23-1+cuda12.1

In my .bashrc, for CUDA I have added:

export LD_LIBRARY_PATH="/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH"
export CUDA_HOME="/usr/local/cuda-12.1"
export PATH="/usr/local/cuda-12.1/bin:$PATH"
export LD_PRELOAD="/usr/lib/aarch64-linux-gnu/libgomp.so.1"

I ran your verification script, and everything was OK.
The code that triggers the error is:

#!/usr/bin/env python3
import torch
import torchvision

print(torch.__version__)
print('CUDA available: ' + str(torch.cuda.is_available()))
print('cuDNN version: ' + str(torch.backends.cudnn.version()))
a = torch.cuda.FloatTensor(2).zero_()
print('Tensor a = ' + str(a))
b = torch.randn(2).cuda()
print('Tensor b = ' + str(b))
c = a + b
print('Tensor c = ' + str(c))
print(torchvision.__version__)

torch.cuda.empty_cache()
device = torch.device('cuda')
torch.nn.functional.conv2d(torch.zeros(32, 32, 32, 32, device=device), torch.zeros(32, 32, 32, 32, device=device))

print("Success!")

I have been looking around in /usr/local/cuda-12.1/lib64 and include, and I don’t see the cuDNN files that I would expect (no cudnn* or libcudnn*). If I follow the tar installation steps I know these get copied in, but I don’t know where they live after a .deb installation.

Relevant Stack Trace:

Traceback (most recent call last):
  File "./cuda_torch_test.py", line 18, in <module>
    torch.nn.functional.conv2d(torch.zeros(32, 32, 32, 32, device=device), torch.zeros(32, 32, 32, 32, device=device))
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Do you have any advice?

Thank you

@claxtono these PyTorch wheels were built against the default version of CUDA/cuDNN that comes with JetPack, so you would need to recompile PyTorch if you install a different major version of CUDA/cuDNN. I would recommend just sticking with the default version that SDK Manager installed.
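
To see which CUDA version a given wheel was built against, you can ask PyTorch itself and compare that with the toolkit on the system. Here is a minimal sketch, assuming `torch.version.cuda` reports the build-time CUDA version as a string such as "11.4". The helper names `cuda_major` and `majors_match` are made up for illustration, and the `installed` value is a placeholder you would replace with your own toolkit version:

```python
#!/usr/bin/env python3
# Sketch: compare the CUDA major version a PyTorch wheel was built with
# against the toolkit version you believe is installed.
# cuda_major() and majors_match() are hypothetical helper names.

def cuda_major(version):
    """Return the major part of a version string like '11.4' as an int."""
    return int(str(version).split('.')[0])

def majors_match(built_with, installed):
    """True when both version strings share the same major version."""
    return cuda_major(built_with) == cuda_major(installed)

if __name__ == '__main__':
    try:
        import torch
    except ImportError:
        torch = None
    # torch.version.cuda is None for CPU-only builds, so guard against that.
    if torch is not None and torch.version.cuda is not None:
        installed = '12.1'  # assumption: replace with the toolkit version on your system
        print('wheel built with CUDA ' + torch.version.cuda)
        print('major versions match: ' + str(majors_match(torch.version.cuda, installed)))
```

If the major versions differ, that is consistent with the mismatch described above, although a matching major version alone does not prove the install is healthy.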

BTW, cuDNN typically gets installed under /usr/lib/aarch64-linux-gnu and /usr/include/aarch64-linux-gnu but I don’t know about the custom-upgraded ones.
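
If it helps to confirm where the cuDNN files actually ended up, a small script can list them. This is only a sketch: the directory list is taken from the paths mentioned in this thread and may not match your system, and `find_cudnn_files` is a made-up helper name.

```python
#!/usr/bin/env python3
# Sketch: list cuDNN shared libraries and headers in candidate directories.
import glob
import os

# Directories mentioned in this thread; adjust for your own setup.
CANDIDATE_DIRS = [
    '/usr/lib/aarch64-linux-gnu',
    '/usr/include/aarch64-linux-gnu',
    '/usr/local/cuda-12.1/lib64',
]

def find_cudnn_files(dirs):
    """Return sorted, de-duplicated paths matching libcudnn* or cudnn*."""
    found = []
    for d in dirs:
        for pattern in ('libcudnn*', 'cudnn*'):
            found.extend(glob.glob(os.path.join(d, pattern)))
    return sorted(set(found))

if __name__ == '__main__':
    for path in find_cudnn_files(CANDIDATE_DIRS):
        print(path)
```

Directories that do not exist are simply skipped, so it is safe to leave extra entries in the list.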


Hi Dusty,

I reverted my CUDA distribution to 11.4, as packaged with JetPack.
I am still experiencing the same error code in the same place. Is there something else I can check?

Thanks