@dusty_nv Ah, thanks a ton. I totally missed that. :)
Failed to install torchvision from source on Orin Nano with PyTorch 2.0 and torchvision 0.15.1:
raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported
Any suggestions on how to fix it?
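In case it helps narrow things down, one way to check which CUDA architectures the installed PyTorch wheel was actually built for is the one-liner below (just a diagnostic sketch - if the wheel supports Orin, sm_87 should appear in the list):
python3 -c "import torch; print(torch.__version__, torch.cuda.get_arch_list())"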
Thank you,
Thanks for sharing the link. It works.
Are there any prebuilt PyTorch wheels with fbgemm or qnnpack enabled? I need int8 support for some experiments.
I am using a Jetson nano with JP 4.6.3.
I can also try to build it myself if someone can provide a link to any documentation.
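For reference, a quick way to check which quantized engines an installed wheel actually exposes (just a diagnostic one-liner):
python3 -c "import torch; print(torch.backends.quantized.supported_engines)"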
Hi @pramodhrachuri, I don't believe there are - when building the PyTorch wheels for JetPack 4, I had disabled QNNPACK because there were compilation errors at some point in the past (and seeing as QNNPACK is CPU-only, I never really dug into it). If you want to try it, the general procedure that I followed for making the wheels can be found under the Build from Source section at the top of this post.
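If you do attempt the build, the switches are environment variables read by PyTorch's setup.py - roughly along these lines (untested on JetPack 4, so treat it as a sketch rather than a known-good recipe):
export USE_QNNPACK=1
export USE_PYTORCH_QNNPACK=1
export USE_FBGEMM=0   # FBGEMM targets x86 AVX2/AVX-512, so it isn't usable on the Jetson's ARM CPU
python3 setup.py bdist_wheel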
Hi @dusty_nv
I'm working with a Xavier AGX, with JetPack 5.1.1 (R35.3.1).
I've used the SDK Manager (1.9.2.10899) to install CUDA/cuDNN.
I followed your steps to install torch (2.0.0) and torchvision (0.15.1) for that jetpack version.
All of the above installs with no problems.
I'm running into issues on two fronts: I'm trying to install Caffe from source as well as run some things with torch. Unfortunately I don't have the Caffe output handy just yet - my priority is to fix the torch environment. With torch, the general error I'm getting is:
<class 'RuntimeError'> cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
I installed CUDA 12.1 via:
I installed CUDNN 8.9.1.23 via:
sudo dpkg -i cudnn-local-repo-ubuntu2004-8.9.1.23_1.0-1_arm64.deb
sudo cp /var/cudnn-local-repo-ubuntu2004-8.9.1.23/cudnn-local-828249D0-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.9.1.23-1+cuda12.1
sudo apt-get install libcudnn8-dev=8.9.1.23-1+cuda12.1
sudo apt-get install libcudnn8-samples=8.9.1.23-1+cuda12.1
In my .bashrc, for CUDA I have added:
export LD_LIBRARY_PATH="/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH"
export CUDA_HOME="/usr/local/cuda-12.1"
export PATH="/usr/local/cuda-12.1/bin:$PATH"
export LD_PRELOAD="/usr/lib/aarch64-linux-gnu/libgomp.so.1"
I run your verification script, and everything is OK.
The code that triggers the error is:
#!/usr/bin/env python3
import torch
import torchvision
print(torch.__version__)
print('CUDA available: ' + str(torch.cuda.is_available()))
print('cuDNN version: ' + str(torch.backends.cudnn.version()))
a = torch.cuda.FloatTensor(2).zero_()
print('Tensor a = ' + str(a))
b = torch.randn(2).cuda()
print('Tensor b = ' + str(b))
c = a + b
print('Tensor c = ' + str(c))
print(torchvision.__version__)
torch.cuda.empty_cache()
device = torch.device('cuda')
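# the conv2d call below is what actually initializes cuDNN - it is the line that raises the error (see the traceback further down)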
torch.nn.functional.conv2d(torch.zeros(32, 32, 32, 32, device=device), torch.zeros(32, 32, 32, 32, device=device))
print("Success!")
I have been snooping around in /usr/local/cuda-12.1/lib64 and include, and don't see the cuDNN files that I would expect (no cudnn* or libcudnn*). If I follow the tar installation steps, I know these get copied in, but I don't know where they live with a .deb installation.
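One way to check where the .deb packages actually placed the libraries (assuming the libcudnn8 package name from the install steps above):
dpkg -L libcudnn8 | grep libcudnn
ldconfig -p | grep cudnn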
Relevant Stack Trace:
Traceback (most recent call last):
File "./cuda_torch_test.py", line 18, in <module>
torch.nn.functional.conv2d(torch.zeros(32, 32, 32, 32, device=device), torch.zeros(32, 32, 32, 32, device=device))
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
Do you have any advice?
Thank you
@claxtono these PyTorch wheels were built against the default version of CUDA/cuDNN that comes with JetPack, so you would need to recompile PyTorch if you install a different major version of CUDA/cuDNN. I would recommend just sticking with the default version that SDK Manager installed.
BTW, cuDNN typically gets installed under /usr/lib/aarch64-linux-gnu and /usr/include/aarch64-linux-gnu, but I don't know about the custom-upgraded ones.
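As a sanity check, you can compare the CUDA version the wheel was compiled against with the cuDNN it actually picks up at runtime (just a diagnostic one-liner):
python3 -c "import torch; print(torch.version.cuda, torch.backends.cudnn.version())"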
Hi Dusty,
I reverted my CUDA distribution to 11.4 as packaged with jetpack.
I am still experiencing the same error code in the same place. Is there something else I can check?
Thanks