PyTorch for Jetson

This was the error… Thank you so much.

Hi @rmj54, this link is working for me. Are you able to try again? Perhaps it’s a network issue on your end?

Hi, this link does not work. 1.10 works but all the rest are down.

Tried this too: wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-1.8.0-cp36-cp36m-linux_aarch64.whl

Same story. The download stops after connecting.

Can you try again, perhaps from your PC browser? I am able to download it now, so it may have been a temporary issue.

PyTorch 1.11 for JetPack 5.0 Developer Preview and Xavier/Orin has been posted:

PyTorch v1.11.0

I tried installing 1.6, 1.7, and 1.10. After installing, when I import torch, the system replies:
module 'typing' has no attribute '_SpecialForm'
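One possible cause, offered only as a guess: a pip-installed typing backport can shadow the standard-library module and has been known to produce this kind of attribute error. A quick check, assuming nothing else about the environment:

$ pip3 show typing        # if an old 'typing' backport package is listed, it may be shadowing the stdlib module
$ pip3 uninstall typing   # hedged: often resolves this error, but verify it is safe for your environment first
$ python3 -c "import torch; print(torch.__version__)"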

Hi @1263032440, are you using Python 3.6? Can you try using the l4t-pytorch container to rule out an environment issue?

I am using JetPack 5.0 and my PyTorch is 1.12.0a0+2c916ef.nv22.3.
I am trying out YOLOv5, and I need torchvision. I installed the main branch of torchvision, but it gives me an incompatibility error:

RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.

Where can I find a torchvision version compatible with torch 1.12.0a0+2c916ef.nv22.3?

Yes, I am using Python 3.6. How can I get the container you described?

You can select one of the tags from here that’s compatible with your L4T version (you can check this with cat /etc/nv_tegra_release)

https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch/tags

And then the command to start the container is listed on this page: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch
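For reference, the launch command from that page generally looks like the following; the tag here is only an example, so substitute the one matching your L4T version:

$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3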

Hi @user22290, can you try this torchvision commit? https://github.com/pytorch/vision/commit/e5a5f0be
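A minimal sketch of checking out and building that commit from source, following the same general steps as the torchvision build instructions later in this thread (the BUILD_VERSION value is a placeholder):

$ git clone https://github.com/pytorch/vision torchvision
$ cd torchvision
$ git checkout e5a5f0be
$ export BUILD_VERSION=0.x.0     # placeholder; set to the torchvision version being built
$ python3 setup.py install --user
$ cd ../                         # leave the build directory before importing torchvision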

I am using the main branch, which already has all the changes in this commit. The changes in this commit relate to the unit tests on Windows, while I am using Linux.

Hi, do you have any other recommendations?
Was anyone able to use JetPack 5 with PyTorch, or do I need to downgrade?

Can you try the nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3 container and see if the error occurs in there?

Also, I built a PyTorch 1.11 wheel for JetPack 5.0 that you could try:

PyTorch v1.11.0

Hi, I am using JetPack 5.0 and I installed torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl. I run into this error when I try to import torch:

~$ python3 -c 'import torch; print(torch.cuda.is_available())'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ytt/.local/lib/python3.8/site-packages/torch/__init__.py", line 195, in <module>
    _load_global_deps()
  File "/home/ytt/.local/lib/python3.8/site-packages/torch/__init__.py", line 148, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory

I have checked that libmpi is installed:

~/.local/lib/python3.8/site-packages/torch$ locate libmpi
/etc/alternatives/libmpi++.so-aarch64-linux-gnu
/etc/alternatives/libmpi.so-aarch64-linux-gnu
/usr/lib/aarch64-linux-gnu/libmpi++.so
/usr/lib/aarch64-linux-gnu/libmpi.so
/usr/lib/aarch64-linux-gnu/libmpi.so.40
/usr/lib/aarch64-linux-gnu/libmpi.so.40.20.3
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/libmpi_java.so
/usr/lib/aarch64-linux-gnu/libmpi_java.so.40
/usr/lib/aarch64-linux-gnu/libmpi_java.so.40.20.0
/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so
/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so.40
/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so.40.20.2
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so.40
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so.40.20.0
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so.40
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so.40.21.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi.so.40.20.3
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_java.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_java.so.40.20.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_mpifh.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_mpifh.so.40.20.2
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempi_ignore_tkr.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempi_ignore_tkr.so.40.20.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempif08.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempif08.so.40.21.0
/usr/lib/python3/dist-packages/mpi4py/libmpi.pxd

But it seems that torch cannot find this library:

~/.local/lib/python3.8/site-packages/torch$ ldd ~/.local/lib/python3.8/site-packages/torch/_C.cpython-36m-aarch64-linux-gnu.so
	linux-vdso.so.1 (0x0000ffff94028000)
	libtorch_python.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libtorch_python.so (0x0000ffff92ebd000)
	libshm.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libshm.so (0x0000ffff92ea5000)
	libnvToolsExt.so.1 => /usr/local/cuda-11.4/lib64/libnvToolsExt.so.1 (0x0000ffff92e8a000)
	libtorch.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libtorch.so (0x0000ffff92e78000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff92e24000)
	libtorch_cpu.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so (0x0000ffff8e7c1000)
	libtorch_cuda.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so (0x0000ffff793e6000)
	libc10_cuda.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libc10_cuda.so (0x0000ffff793a8000)
	libc10.so => /home/ytt/.local/lib/python3.8/site-packages/torch/lib/libc10.so (0x0000ffff7931b000)
	libcudnn.so.8 => /lib/aarch64-linux-gnu/libcudnn.so.8 (0x0000ffff792e1000)
	libcudart.so.10.2 => not found
	libmpi_cxx.so.20 => not found
	libmpi.so.20 => not found
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff792b1000)
	libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff790cc000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff790a8000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff78f35000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff93ff8000)
	librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000ffff78f1d000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff78e70000)
	libgomp.so.1 => /lib/aarch64-linux-gnu/libgomp.so.1 (0x0000ffff78e22000)
	libnuma.so.1 => /lib/aarch64-linux-gnu/libnuma.so.1 (0x0000ffff78e03000)
	libmpi_cxx.so.20 => not found
	libmpi.so.20 => not found
	libopenblas.so.0 => /lib/aarch64-linux-gnu/libopenblas.so.0 (0x0000ffff77f6b000)
	libcudart.so.10.2 => not found
	libcudart.so.10.2 => not found
	libcusparse.so.10 => not found
	libcurand.so.10 => /usr/local/cuda-11.4/lib64/libcurand.so.10 (0x0000ffff72b6f000)
	libcusolver.so.10 => not found
	libmpi_cxx.so.20 => not found
	libmpi.so.20 => not found
	libcufft.so.10 => /usr/local/cuda-11.4/lib64/libcufft.so.10 (0x0000ffff67c9c000)
	libcublas.so.10 => not found
	libcudart.so.10.2 => not found
	libgfortran.so.5 => /lib/aarch64-linux-gnu/libgfortran.so.5 (0x0000ffff67b21000)

I have tried export LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$LIBRARY_PATH, but it doesn’t work.
Do you know how to solve this problem?
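As an aside, LIBRARY_PATH only affects link time; the runtime loader looks at LD_LIBRARY_PATH instead. A quick check along those lines, without assuming anything about the actual fix:

$ export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$LD_LIBRARY_PATH
$ ldconfig -p | grep libmpi          # list the MPI sonames the loader knows about
$ python3 -c "import torch"          # re-test the import

Note that the ldd output above asks for libmpi_cxx.so.20 while the installed OpenMPI provides .so.40, so adjusting the search path alone may not resolve the versioned-soname mismatch.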

Hi, I’m trying to replicate results from this blog post and I’m unable to do so using the above installation process. I was able to install torch 1.9.0, but I’m unable to install torchvision. I did go through all the above commands for torchvision v0.10.0, but when imported it seems empty, since it has neither the __version__ attribute nor the ops.nms() function that I’m using in my code. Any idea what went wrong?

I’m running:
JetPack 4.6.1-b110
Python 3.6.9
Jetson Nano 2GB

Hmm, interesting, because when I install this wheel, it does not have a dependency on MPI:

ldd ~/.local/lib/python3.8/site-packages/torch/_C.cpython-38-aarch64-linux-gnu.so
        linux-vdso.so.1 (0x0000ffff9558e000)
        libtorch_python.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libtorch_python.so (0x0000ffff949c9000)
        libshm.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libshm.so (0x0000ffff949b1000)
        libtorch.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libtorch.so (0x0000ffff9499f000)
        libnvToolsExt.so.1 => /usr/local/cuda-11.4/lib64/libnvToolsExt.so.1 (0x0000ffff94984000)
        libtorch_cpu.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so (0x0000ffff90232000)
        libtorch_cuda.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so (0x0000ffff80e32000)
        libc10_cuda.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libc10_cuda.so (0x0000ffff80dd3000)
        libcudart.so.11.0 => /usr/local/cuda-11.4/lib64/libcudart.so.11.0 (0x0000ffff80d1e000)
        libcudnn.so.8 => /lib/aarch64-linux-gnu/libcudnn.so.8 (0x0000ffff80cc0000)
        libc10.so => /home/nvidia/.local/lib/python3.8/site-packages/torch/lib/libc10.so (0x0000ffff80c49000)
        libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff80a64000)
        libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff80a40000)
        libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff80a10000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8089d000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffff9555e000)
        librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000ffff80885000)
        libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff80871000)
        libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff807c4000)
        libgomp.so.1 => /lib/aarch64-linux-gnu/libgomp.so.1 (0x0000ffff80776000)
        libnuma.so.1 => /lib/aarch64-linux-gnu/libnuma.so.1 (0x0000ffff80757000)
        libopenblas.so.0 => /lib/aarch64-linux-gnu/libopenblas.so.0 (0x0000ffff7f8bf000)
        libcusparse.so.11 => /usr/local/cuda-11.4/lib64/libcusparse.so.11 (0x0000ffff71cba000)
        libcurand.so.10 => /usr/local/cuda-11.4/lib64/libcurand.so.10 (0x0000ffff6c8be000)
        libcusolver.so.11 => /usr/local/cuda-11.4/lib64/libcusolver.so.11 (0x0000ffff5f83a000)
        libcufft.so.10 => /usr/local/cuda-11.4/lib64/libcufft.so.10 (0x0000ffff54969000)
        libcublas.so.11 => /usr/local/cuda-11.4/lib64/libcublas.so.11 (0x0000ffff4a917000)
        libgfortran.so.5 => /lib/aarch64-linux-gnu/libgfortran.so.5 (0x0000ffff4a79c000)
        libcublasLt.so.11 => /usr/local/cuda-11.4/lib64/libcublasLt.so.11 (0x0000ffff3343b000)

So I wonder where this issue came from. Can you try uninstalling torch, downloading the wheel again, and re-installing? If that doesn’t work, can you try the nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3 container or the PyTorch 1.11 wheel that I posted?
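A minimal sketch of that clean re-install; the wheel filename matches the one mentioned above, and the URL is a placeholder for the link in the JetPack 5.0 post:

$ pip3 uninstall torch
$ wget <URL of the JetPack 5.0 PyTorch wheel> -O torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl
$ pip3 install torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl
$ python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"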

Did you install torchvision by building it from source, or from the wheel that was posted in that blog post? I’m not familiar with their wheel. If you built it from source, make sure your terminal is in a different directory than where you built it from.
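For example, a quick way to check (a sketch, run from any directory outside the torchvision source tree):

$ cd ~   # make sure we are not sitting in the torchvision build directory
$ python3 -c "import torchvision; print(torchvision.__version__)"

If that prints the expected version, the empty-looking module was just Python picking up the local source tree instead of the installed package.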

Also, if you continue having issues, you could use the l4t-pytorch or jetson-inference container which already come with PyTorch and torchvision pre-installed.

I have used JetPack with torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl and got an error. You can fix it by installing:

It worked for me.

I specified it badly. I was not able to build it using their steps. Instead, I used the steps from the beginning of this thread that you posted.
torch:

$ wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-1.8.0-cp36-cp36m-linux_aarch64.whl
$ sudo apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev
$ pip3 install Cython
$ pip3 install numpy torch-1.8.0-cp36-cp36m-linux_aarch64.whl

torchvision:

$ sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev
$ git clone --branch <version> https://github.com/pytorch/vision torchvision   # see below for version of torchvision to download
$ cd torchvision
$ export BUILD_VERSION=0.x.0 # where 0.x.0 is the torchvision version
$ python3 setup.py install --user
$ cd ../   # attempting to load torchvision from the build dir will result in an import error
$ pip install 'pillow<7'   # always needed for Python 2.7, not needed for torchvision v0.5.0+ with Python 3.6

I installed torch just fine, but when I continue by building torchvision, it gets stuck at "Allowing ninja to set a default number of workers… (overridable by setting the environment variable MAX_JOBS=N)", or at least it looks like it’s stuck. How long does it take to build using your steps?
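One way to cap the parallel compile jobs named in that message, offered as a sketch since whether it helps depends on why the build appears to stall (on a 2GB board, memory pressure can make the compile extremely slow or appear frozen):

$ cd torchvision
$ export MAX_JOBS=1   # limit the build to a single worker to reduce memory pressure
$ python3 setup.py install --user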