Incompatible torch2.2+Cuda12.2 wheel with other python libraries for AGX Orin Jetpack6.0

lfermoselle · May 15, 2024, 3:54pm

I’m trying to install torch 2.2 with CUDA 12.2 enabled on my AGX Orin Developer Kit. I followed the instructions here and installed torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl successfully. However, I also need to install other packages that depend on torch (versions chosen for known compatibility with torch 2.2):

torchvision==0.17.0
torchaudio==2.2.0
kornia==0.7.1
lightning==2.2.0

The issue I encounter is that when I install any of these packages, a new wheel for torch==2.2.0 without CUDA is downloaded and installed as a required dependency, effectively overwriting my previous installation of torch==2.2.0a0+81ea7a4 and making me lose CUDA access.

How can I install my desired packages alongside the provided wheel torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl and keep CUDA enabled? Which versions or specific wheels should I use?

System specifications:
JetPack 6.0
L4T version: R36.2.0
CUDA 12.2
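
The overwrite described above is consistent with how pip compares version strings. The sketch below uses the packaging library (the same one the PyTorch sanity test script imports) to check whether a given torch version satisfies an exact pin. It assumes, without having inspected the package metadata, that the PyPI releases of torchvision 0.17.0 and torchaudio 2.2.0 declare their dependency as torch==2.2.0; the helper name satisfies_pin is made up for this illustration.

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Assumption: the PyPI builds of torchvision 0.17.0 and torchaudio 2.2.0
# require exactly this torch version.
PYPI_PIN = SpecifierSet("==2.2.0")


def satisfies_pin(installed: str, pin: SpecifierSet = PYPI_PIN) -> bool:
    """Return True if the version string `installed` matches `pin`."""
    # prereleases=True, so a pre-release tag alone is not what rejects a match.
    return pin.contains(Version(installed), prereleases=True)


# The Jetson wheel reports an alpha version with a local build tag, while the
# l4t-pytorch container reports a plain 2.2.0.
for candidate in ("2.2.0a0+81ea7a4.nv24.01", "2.2.0"):
    print(candidate, "satisfies ==2.2.0:", satisfies_pin(candidate))
```

Under that assumption, the alpha build does not count as 2.2.0, so pip fetches a torch that does. A wheel that reports plain 2.2.0 would satisfy the pin and should be left in place.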

i_love_nvidia · May 15, 2024, 6:14pm

You have to build it from source.

dusty_nv · May 15, 2024, 6:21pm

Or re-install the GPU version of the PyTorch wheel after. Wheels for torchvision and torchaudio have also been posted here:

Keeping all the versions straight and tested, with CUDA enabled, is one of the reasons for the jetson-containers project, but you can manually install the wheels that it built if you don’t wish to use Docker.

lfermoselle · May 15, 2024, 6:44pm

Thanks @dusty_nv! I have actually tried those torch 2.3 wheels (including torchaudio and torchvision), but CUDA was still not enabled after installing that combination. Also, I need the lightning library, which is not compatible with torch 2.3, so I have to stay on torch 2.2. Are there torchvision and torchaudio wheels available for torch==2.2.0a0+81ea7a4 as well?

Re containers: Would you be able to point me to the container in that project which matches my system requirements (cuda12.2) and keeps torch2.2 version on all torch wheels? I could not find one there…

dusty_nv · May 16, 2024, 2:18pm

Hi @lfermoselle, yep you can find the previous wheels for torchaudio and torchvision here:

And the l4t-pytorch container comes with PyTorch 2.2, torchaudio 2.2, and torchvision 0.17:

dustynv/l4t-pytorch:r36.2.0
jetson-containers/packages/l4t/l4t-pytorch at master · dusty-nv/jetson-containers · GitHub

These containers are built to use that pip server from above, so when you pip3 install lightning on top, it should automatically keep/install my CUDA-enabled torch wheels instead of pulling the CPU-only ones from PyPI.
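
One way to confirm that an install on top of the container left the torch wheel alone is to record the installed distribution version before and after. This is only a sketch using the standard library's importlib.metadata; the helper names installed_version and torch_was_replaced are hypothetical, and the pip step is left as a comment to run manually.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(name: str):
    """Return the installed version of distribution `name`, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def torch_was_replaced(before, after) -> bool:
    """True if the recorded torch version changed between the two snapshots."""
    return before != after


before = installed_version("torch")
# ... run `pip3 install lightning` in a shell at this point ...
after = installed_version("torch")
print("torch before:", before, "| after:", after,
      "| replaced:", torch_was_replaced(before, after))
```

A change in the version string only shows that pip swapped the wheel. Whether the new one has CUDA still needs a check such as torch.cuda.is_available().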

lfermoselle · May 16, 2024, 2:58pm

Thank you so much @dusty_nv! I have tried the container you provided dustynv/l4t-pytorch:r36.2.0 and I see the following wheels installed:

torch 2.2.0
torchvision 0.17.2+c1d70fe
torchaudio 2.2.2+cefdb36

It seems the torchvision and torchaudio wheels are CUDA-enabled, but the main torch wheel is missing CUDA. I tested your image fresh and have not installed any of my desired Python libraries in it yet. Is this the expected behaviour? Should I still install the torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl wheel in this image myself?

dusty_nv · May 16, 2024, 7:00pm

No, you should not need to. It is strange, because the same container image reports torch 2.2.0 as having CUDA enabled in the tests here, on different Jetsons with R36.2 and R36.3…

For example, this is the output I get from running this torch sanity test script on R36.3 using the dustynv/l4t-pytorch:r36.2.0 image (which was built on another machine):

testing PyTorch...
PyTorch version: 2.2.0
CUDA available:  True
cuDNN version:   8904
PyTorch built with:
  - GCC 11.4
  - C++ Version: 201703
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - CUDA Runtime 12.2
  - NVCC architecture flags: -gencode;arch=compute_87,code=sm_87
  - CuDNN 8.9.4
  - Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=12.2, CUDNN_VERSION=8.9.4, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

PACKAGING_VERSION=2.2.0
TORCH_CUDA_ARCH_LIST=8.7

/mount/jetson-containers/dev/packages/pytorch/test.py:23: UserWarning: The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at /opt/pytorch/torch/csrc/tensor/python_tensor.cpp:83.)
  a = torch.cuda.FloatTensor(2).zero_()
Tensor a = tensor([0., 0.], device='cuda:0')
Tensor b = tensor([-0.1736,  1.9561], device='cuda:0')
Tensor c = tensor([-0.1736,  1.9561], device='cuda:0')
testing LAPACK (OpenBLAS)...
done testing LAPACK (OpenBLAS)
testing torch.nn (cuDNN)...
done testing torch.nn (cuDNN)
testing CPU tensor vector operations...
/mount/jetson-containers/dev/packages/pytorch/test.py:62: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  cpu_y = F.softmax(cpu_x)
Tensor cpu_x = tensor([12.3450])
Tensor softmax = tensor([1.])
Tensor exp (float32) = tensor([[2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183]])
Tensor exp (float64) = tensor([[2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183],
        [2.7183, 2.7183, 2.7183]], dtype=torch.float64)
Tensor exp (diff) = 7.429356050359104e-07
PyTorch OK

lfermoselle · May 16, 2024, 9:52pm

Thank you for the quick reply and for pointing out this test script, @dusty_nv. It seems the container is having trouble finding the CUDA driver on my system. Here is the output of running your test script on my Jetson (R36.2) after pulling your container with docker pull dustynv/l4t-pytorch:r36.2.0, starting it with docker run --name test -it dustynv/l4t-pytorch:r36.2.0, and typing python3:

print('testing PyTorch...')                                                                                                               
testing PyTorch...                                                                                                                            
                                                                                                                                  
import torch                                                                                                                              
                                                                                                                                          
print('PyTorch version: ' + str(torch.__version__))                                                                                       
PyTorch version: 2.2.0                                                                                                                        
print('CUDA available:  ' + str(torch.cuda.is_available()))                                                                               
CUDA available:  False                                                                                                                        
print('cuDNN version:   ' + str(torch.backends.cudnn.version()))                                                                          
cuDNN version:   8904                                                                                                                         
                                                                                                                                          
print(torch.__config__.show())                                                                                                            
PyTorch built with:                                                                                                                           
  - GCC 11.4                                                                                                                                  
  - C++ Version: 201703                                                                                                                       
  - OpenMP 201511 (a.k.a. OpenMP 4.5)                                                                                                         
  - LAPACK is enabled (usually provided by MKL)                                                                                               
  - NNPACK is enabled                                                                                                                         
  - CPU capability usage: NO AVX                                                                                                              
  - Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=12.2, CUDNN_VERSION=8.9.4, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 
                                                                                                                                              
                                                                                                                                          
# fail if CUDA isn't available                                                                                                            
assert(torch.cuda.is_available())                                                                                                         
Traceback (most recent call last):                                                                                                            
  File "<stdin>", line 1, in <module>                                                                                                         
AssertionError                                                                                                                                
                                                                                                                                          
# check that version can be parsed                                                                                                        
from packaging import version                                                                                                             
from os import environ                                                                                                                    
                                                                                                                                          
print('PACKAGING_VERSION=' + str(version.parse(torch.__version__)))                                                                       
PACKAGING_VERSION=2.2.0                                                                                                                       
print('TORCH_CUDA_ARCH_LIST=' + environ.get('TORCH_CUDA_ARCH_LIST', 'None') + '\n')                                                       
TORCH_CUDA_ARCH_LIST=8.7                                                                                                       
                                                                                                                                  
                                                                                                                           
# quick cuda tensor test                                                                                                      
a = torch.cuda.FloatTensor(2).zero_()                                                                                      
<stdin>:1: UserWarning: The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at /opt/pytorch/torch/csrc/tensor/python_tensor.cpp:83.)
Traceback (most recent call last): 
  File "<stdin>", line 1, in <module>                                                                                                         
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()                                                                                                         
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
print('Tensor a = ' + str(a))                                         
Traceback (most recent call last):                                                                                                            
  File "<stdin>", line 1, in <module>                          
NameError: name 'a' is not defined                              
                                                           
b = torch.randn(2).cuda()                                             
Traceback (most recent call last):                                        
  File "<stdin>", line 1, in <module>                                     
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()                                              
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
print('Tensor b = ' + str(b))                                         
Traceback (most recent call last):                                     
  File "<stdin>", line 1, in <module>                                     
NameError: name 'b' is not defined                                        
                                                                      
c = a + b                                                             
Traceback (most recent call last):                                        
  File "<stdin>", line 1, in <module>                                     
NameError: name 'a' is not defined                                        
print('Tensor c = ' + str(c))                                         
Traceback (most recent call last):                                        
  File "<stdin>", line 1, in <module>
NameError: name 'c' is not defined                                        
                                                                      
# LAPACK test                                                         
print('testing LAPACK (OpenBLAS)...')                                 
testing LAPACK (OpenBLAS)...                                              
                                                                      
a = torch.randn(2, 3, 1, 4, 4)                                        
b = torch.randn(2, 3, 1, 4, 4)                                        
                                                                      
x, lu = torch.linalg.solve(b, a)                                      
                                                                      
print('done testing LAPACK (OpenBLAS)')                               
done testing LAPACK (OpenBLAS)                                            
                                                                      
# torch.nn test                                                    
print('testing torch.nn (cuDNN)...')                                  
testing torch.nn (cuDNN)...                                            
                                                                      
import torch.nn                                                       
                                                                   
model = torch.nn.Conv2d(3,3,3)                                        
data = torch.zeros(1,3,10,10)                                         
model = model.cuda()                                               
Traceback (most recent call last):                                        
  File "<stdin>", line 1, in <module>                                     
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 911, in cuda
    return self._apply(lambda t: t.cuda(device))                                                                                              
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)                                          
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 911, in <lambda>
    return self._apply(lambda t: t.cuda(device))                                                                                              
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()                                              
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
data = data.cuda()                                                 
Traceback (most recent call last):                                     
  File "<stdin>", line 1, in <module>                                  
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()                                              
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
out = model(data)                                                  
                                                                   
#print(out)                                                        
                                                                   
print('done testing torch.nn (cuDNN)')                                                                                                    
done testing torch.nn (cuDNN)                                          
                                                                   
# CPU test (https://github.com/pytorch/pytorch/issues/47098)                                                                              
print('testing CPU tensor vector operations...')                                                                                          
testing CPU tensor vector operations...                                                                                                       
                                                                   
import torch.nn.functional as F                                    
cpu_x = torch.tensor([12.345])                                     
cpu_y = F.softmax(cpu_x)                                           
<stdin>:1: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
                                                                   
print('Tensor cpu_x = ' + str(cpu_x))                                                                                                     
Tensor cpu_x = tensor([12.3450])                                       
print('Tensor softmax = ' + str(cpu_y))                                                                                                   
Tensor softmax = tensor([1.])                                          
                                                                   
if cpu_y != 1.0:                                                   
...     raise ValueError('PyTorch CPU tensor vector test failed (softmax)\n')
...                                                                    
# https://github.com/pytorch/pytorch/issues/61110                                                                                         
t_32 = torch.ones((3,3), dtype=torch.float32).exp()                                                                                       
t_64 = torch.ones((3,3), dtype=torch.float64).exp()                                                                                       
diff = (t_32 - t_64).abs().sum().item()                                                                                                   
                                                                   
print('Tensor exp (float32) = ' + str(t_32))                                                                                              
Tensor exp (float32) = tensor([[2.7183, 2.7183, 2.7183],                                                                                      
        [2.7183, 2.7183, 2.7183],                                      
        [2.7183, 2.7183, 2.7183]])                                     
print('Tensor exp (float64) = ' + str(t_64))                                                                                              
Tensor exp (float64) = tensor([[2.7183, 2.7183, 2.7183],                                                                                      
        [2.7183, 2.7183, 2.7183],                                      
        [2.7183, 2.7183, 2.7183]], dtype=torch.float64)                                                                                       
print('Tensor exp (diff) = ' + str(diff))                                                                                                 
Tensor exp (diff) = 7.429356050359104e-07                                                                                                     
                                                                   
if diff > 0.1:                                                     
...     raise ValueError(f'PyTorch CPU tensor vector test failed (exp, diff={diff})')
...                                                                    
... print('PyTorch OK\n')

Outside the container, when I run nvcc --version on my AGX Orin I get the following:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:08:11_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

My system Jetson version is: nvidia-l4t-core 36.2.0-20231218214829

I have been able to install torch 2.2 with CUDA before using the wheel torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl, so I believe my system has a valid CUDA driver. Am I missing some configuration step needed to use your container with torch 2.2 and CUDA?

Thank you for all your help with this issue!
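
The log above shows USE_CUDA=ON and CUDA_VERSION=12.2 in the build settings while torch.cuda.is_available() returns False and the first CUDA call reports "Found no NVIDIA driver". That combination suggests a CUDA-enabled wheel running in an environment where the GPU is not visible, not a CPU-only wheel. The sketch below separates the two cases. The diagnose and probe helpers and their labels are made up for illustration, and probe returns None when torch is not importable.

```python
def diagnose(built_with_cuda: bool, cuda_available: bool) -> str:
    """Classify a torch install from two facts about it."""
    if cuda_available:
        return "ok"
    if built_with_cuda:
        # The wheel has CUDA support, so the problem lies in the environment
        # (driver visibility, container setup, environment variables).
        return "driver-not-visible"
    # A CPU-only wheel: no environment change will enable CUDA.
    return "cpu-only-build"


def probe():
    """Run the classification against the local torch, if there is one."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.cuda is None for CPU-only builds.
    return diagnose(torch.version.cuda is not None, torch.cuda.is_available())


print("torch status:", probe())
```

For the session above this would presumably print driver-not-visible, which points at the container or host setup and not at the wheel itself.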

lfermoselle · May 17, 2024, 6:38pm

I have solved my issue above: I was just missing CUDA on my PATH variable. Adding export PATH=${CUDA_HOME}/bin:${PATH} fixed it - I now have torch 2.2, torchvision and torchaudio correctly installed with CUDA enabled. Thank you for this image @dusty_nv!
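
A quick way to check for the same omission is to test whether ${CUDA_HOME}/bin appears among the PATH entries. The sketch below is a minimal, hedged version of that check. The helper name cuda_bin_on_path is hypothetical, and it assumes CUDA_HOME is set in the environment being inspected.

```python
import os


def cuda_bin_on_path(env) -> bool:
    """Return True if $CUDA_HOME/bin is one of the PATH entries in `env`."""
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return False  # Without CUDA_HOME there is nothing to look for.
    cuda_bin = os.path.normpath(os.path.join(cuda_home, "bin"))
    entries = [
        os.path.normpath(entry)
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    ]
    return cuda_bin in entries


# Check the current process environment.
print("CUDA bin directory on PATH:", cuda_bin_on_path(os.environ))
```

If this prints False while CUDA_HOME is set, the export line from the post above is the corresponding fix for the current shell.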

system · June 18, 2024, 10:40am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
