Installing PyTorch v2.1.0 and Torchvision v0.16.1 on Jetson Orin Nano with JetPack 5.1.2

Hi there!
I am trying to install PyTorch v2.1.0 and Torchvision v0.16.1 on my Jetson Orin Nano Developer Kit running JetPack 5.1.2, following the instructions from PyTorch for Jetson.
I see the following behavior, which I do not understand:

~$ pip3 install numpy torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (1.17.4)
Processing ./torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
Requirement already satisfied: typing-extensions in ./.local/lib/python3.8/site-packages (from torch==2.1.0a0+41361538.nv23.06) (4.9.0)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 1.8.0
    Uninstalling torch-1.8.0:
      Successfully uninstalled torch-1.8.0
Successfully installed torch-1.8.0

Why does the command install torch-1.8.0 and not torch-2.1.0 as the wheel name would suggest?
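
One way to check this kind of mismatch: a wheel is a zip archive, and the version that pip records comes from the METADATA file inside it, not from the name the file was saved under. The sketch below compares the two. The helper names are hypothetical, and it assumes the wheel sits in the current directory under the name used above.

```python
import os
import zipfile


def wheel_filename_version(path):
    """Version field taken from the wheel's file name (name-version-...)."""
    return os.path.basename(path).split("-")[1]


def wheel_metadata_version(path):
    """Version field from the *.dist-info/METADATA file inside the wheel."""
    with zipfile.ZipFile(path) as whl:
        for name in whl.namelist():
            if name.endswith(".dist-info/METADATA"):
                text = whl.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("Version:"):
                        return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Assumed location: the wheel downloaded into the current directory.
    wheel = "torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl"
    print("file name says:", wheel_filename_version(wheel))
    if os.path.exists(wheel):
        print("metadata says: ", wheel_metadata_version(wheel))
```

If the two values differ, the file was probably downloaded from a different URL than its name suggests.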

Later in the process, building Torchvision fails with the following error:

~/torchvision$ python3 setup.py install --user
Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    import torch
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 195, in <module>
    _load_global_deps()
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 148, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory

Could this second error be caused by torch 1.8.0 being installed instead of 2.1.0?
I checked the following:

$ find /usr/lib -name 'libmpi_cxx*'
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so
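
The find output can be read as a version mismatch: the loader asks for major version 20 of libmpi_cxx, while only major version 40 is present. A minimal sketch of that comparison, using the paths listed above as input (the helper names are hypothetical):

```python
import re


def required_major(soname):
    """Major version from a soname such as 'libmpi_cxx.so.20'."""
    return int(soname.rsplit(".so.", 1)[1].split(".")[0])


def installed_majors(paths, stem):
    """Major versions of lib 'stem' that appear in a list of file paths."""
    majors = set()
    for path in paths:
        match = re.search(re.escape(stem) + r"\.so\.(\d+)", path)
        if match:
            majors.add(int(match.group(1)))
    return sorted(majors)


# Paths copied from the find output above.
found = [
    "/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1",
    "/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so",
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40",
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1",
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so",
]
needed = "libmpi_cxx.so.20"

if __name__ == "__main__":
    print("needed major:    ", required_major(needed))
    print("installed majors:", installed_majors(found, "libmpi_cxx"))
```

A missing major version usually means the binary was built for a different release of the system libraries, not that a package is missing.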

Do you have any thoughts on what is happening here?
Thanks!

Hi @spin, please try downloading and installing the wheel again with the URL for the PyTorch 2.1 wheel instead of the URL for the PyTorch 1.8 wheel.

Thanks for your answer @dusty_nv! Indeed, I had the wrong URL.

However, I still run into problems when installing Torchvision:

$ sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libopenblas-dev libavcodec-dev libavformat-dev libswscale-dev
$ git clone --branch v0.16.1 https://github.com/pytorch/vision torchvision   # see below for version of torchvision to download
$ cd torchvision
$ python3 setup.py install --user
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 174, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcufft.so.11: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    import torch
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 234, in <module>
    _load_global_deps()
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 195, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/home/user/.local/lib/python3.8/site-packages/torch/__init__.py", line 160, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['/home/user/torchvision', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/home/user/.local/lib/python3.8/site-packages', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.8/dist-packages']

Now the error no longer involves libmpi_cxx.so.20, but libcufft.so.11 and libcublas.so.*[0-9] instead.
Do you have any idea how to fix this? Thanks!
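
To see which CUDA library versions the system can load, the same ctypes call that appears in the traceback can be used as a probe. This is only a sketch: libcufft.so.11 comes from the error above, while libcufft.so.10 and libcublas.so.11 are assumed names for the CUDA release that ships with JetPack 5 and may differ on a given system.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


def probe(sonames):
    """Map each soname to whether it could be loaded."""
    return {name: can_load(name) for name in sonames}


if __name__ == "__main__":
    candidates = [
        "libcufft.so.11",   # requested by the installed wheel (see traceback)
        "libcufft.so.10",   # assumption: version shipped with JetPack 5
        "libcublas.so.11",  # assumption: version shipped with JetPack 5
    ]
    for name, ok in probe(candidates).items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

If the version the wheel requests is missing while another version loads, the wheel was most likely built for a different JetPack release.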

@spin sorry, I gave you the wheel for JetPack 6, not JetPack 5. Try this one instead:

PyTorch v2.1.0

https://developer.download.nvidia.cn/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
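
After installing the wheel, a short check can confirm that the expected build is in place before building Torchvision. The sketch below is a hypothetical helper, not part of the official instructions. It catches OSError and ValueError in addition to ImportError because those are the exceptions raised by `import torch` earlier in this thread.

```python
def torch_status():
    """Return basic facts about the installed torch build, or the import error."""
    try:
        import torch
    except (ImportError, OSError, ValueError) as exc:
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return {
        "ok": True,
        "version": torch.__version__,         # expected to start with 2.1.0
        "cuda_build": torch.version.cuda,     # CUDA version torch was built with
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_status().items():
        print(f"{key}: {value}")
```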
