PyTorch for Jetson

themozel · June 17, 2020, 11:48am

Thank you
I think I need to reflash the device

ruelj2 · June 18, 2020, 9:06pm

Thanks for this piece of code! Really appreciated. My concern is about building it in a Docker image. For now it seems almost impossible, since MAGMA depends on cuBLAS, which does not seem to be available in NVIDIA’s l4t-base image. Also, NVIDIA doesn’t share CUDA binaries for arm64 installation.
@dusty_nv do you see a solution?

Thanks a lot!

dusty_nv · June 19, 2020, 2:40pm

Hi @ruelj2, I see cublas in the l4t-base image:

$  sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-base:r32.4.2
# ls -ll /usr/lib/aarch64-linux-gnu/libcublas*
lrwxrwxrwx 1 root root       15 Jun 19 14:38 /usr/lib/aarch64-linux-gnu/libcublas.so -> libcublas.so.10
lrwxrwxrwx 1 root root       22 Jun 19 14:38 /usr/lib/aarch64-linux-gnu/libcublas.so.10 -> libcublas.so.10.2.2.89
-rw-r--r-- 1 root root 80530928 Oct 29  2019 /usr/lib/aarch64-linux-gnu/libcublas.so.10.2.2.89
lrwxrwxrwx 1 root root       17 Jun 19 14:38 /usr/lib/aarch64-linux-gnu/libcublasLt.so -> libcublasLt.so.10
lrwxrwxrwx 1 root root       24 Jun 19 14:38 /usr/lib/aarch64-linux-gnu/libcublasLt.so.10 -> libcublasLt.so.10.2.2.89
-rw-r--r-- 1 root root 33235064 Oct 29  2019 /usr/lib/aarch64-linux-gnu/libcublasLt.so.10.2.2.89

Are you running it with --runtime nvidia? To use it during docker build operations, you should set the default-runtime to nvidia: https://github.com/dusty-nv/jetson-containers#docker-default-runtime

ruelj2 · June 19, 2020, 3:29pm

Thank you very much for this valuable information. Although it is not possible to specify --runtime nvidia during the build stage, it is possible to “build my app within the container manually and commit the resulting image” using docker run -it --rm --runtime nvidia (ref).

dusty_nv · June 19, 2020, 3:32pm

To get the nvidia runtime during build stage, you can set the default-runtime to nvidia in your docker daemon configuration, as shown here: https://github.com/dusty-nv/jetson-containers#docker-default-runtime
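
For readers who prefer to script that change, here is a minimal sketch in Python. It assumes the daemon configuration lives at /etc/docker/daemon.json and that the nvidia runtime entry has the usual `nvidia-container-runtime` form; the helper name `with_nvidia_default_runtime` is hypothetical, not part of Docker or jetson-containers. The linked README remains the authoritative reference.

```python
import json


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a Docker daemon config with nvidia as the default runtime."""
    updated = dict(config)
    runtimes = dict(updated.get("runtimes", {}))
    # Keep an existing nvidia entry; otherwise add the commonly used one (assumed).
    runtimes.setdefault(
        "nvidia", {"path": "nvidia-container-runtime", "runtimeArgs": []}
    )
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Print the JSON that would go into /etc/docker/daemon.json (assumed path).
    print(json.dumps(with_nvidia_default_runtime({}), indent=4))
```

After writing the file, the Docker service has to be restarted before docker build picks up the new default runtime.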

hyunjaecho1213 · June 20, 2020, 12:33pm

@dusty_nv
I am having difficulty installing PyTorch; importing it fails with the following error:

Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/derek/.local/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

I realize that some people above had the same error, so I installed it using the following commands, but the same error persists.

wget https://nvidia.box.com/shared/static/c3d7vm4gcs9m728j6o5vjay2jdedqb55.whl
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
pip3 install numpy torch-1.4.0-cp36-cp36m-linux_aarch64.whl

I flashed the image to an SD card.

Ubuntu 18.04
CUDA Version 10.2.89
I'm not sure which JetPack version this is, but the SD card image was downloaded today from https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit#write on a Mac host.

mura.ryo03302 · June 21, 2020, 5:54am

I can’t install PyTorch on a Jetson Xavier NX. The same problem occurs.
import error: from torch._C import *

qczone · June 22, 2020, 1:36pm

I notice that PyTorch has been updated to 1.5.1. Has the PyTorch build for the Jetson Nano been updated as well? If so, can you give me a link? Thanks ^_^

qczone · June 22, 2020, 1:55pm

Installing PyTorch v1.5.0 can solve this problem: the CUDA version in the new image is 10.2, while the wheel that fails to import is looking for libcudart.so.10.0.
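
A small sketch of that version check, in Python. It only parses the ImportError text and compares it with the CUDA version reported by nvcc or the image; the function names are hypothetical and nothing here touches PyTorch itself.

```python
import re
from typing import Optional


def required_cudart_version(error_message: str) -> Optional[str]:
    """Extract the libcudart version named in an ImportError message, if any."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


def wheel_matches_cuda(error_message: str, installed_cuda: str) -> bool:
    """True if the libcudart version in the error agrees with the installed CUDA."""
    required = required_cudart_version(error_message)
    if required is None:
        return True  # the error is not about libcudart at all
    required_parts = required.split(".")
    installed_parts = installed_cuda.split(".")
    # Compare only as many components as the library name carries (e.g. 10.0).
    return installed_parts[: len(required_parts)] == required_parts


if __name__ == "__main__":
    msg = ("ImportError: libcudart.so.10.0: cannot open shared object file: "
           "No such file or directory")
    print(required_cudart_version(msg), wheel_matches_cuda(msg, "10.2.89"))
```

With the error from this thread and CUDA 10.2.89 installed, the check reports a mismatch, which is why a wheel built against CUDA 10.2 is needed.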

dusty_nv · June 22, 2020, 3:06pm

In the past I haven’t built the *.1 minor releases because of time/support constraints, but I will pick up PyTorch 1.6 when it’s released. You can try building 1.5.1 from source though.

dusty_nv · June 22, 2020, 3:07pm

Hi @mura.ryo03302, which version of PyTorch did you install and which link did you use to download the wheel?

mura.ryo03302 · June 22, 2020, 3:43pm

I installed PyTorch 1.4 for JetPack 4.4 DP:
Python 3.6 - torch-1.4.0-cp36-cp36m-linux_aarch64.whl

dusty_nv · June 22, 2020, 4:44pm

Hmm. Does it have any other error text, or does it only say import error: from torch._C import *?

Also, you are trying to import this from a python3 environment, correct?

If you run the following command from a terminal, does it show any broken links or libraries it can’t find?

$ ldd ~/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-aarch64-linux-gnu.so
        linux-vdso.so.1 (0x0000007f83729000)
        libtorch_python.so => /home/nvidia/.local/lib/python3.6/site-packages/torch/./lib/libtorch_python.so (0x0000007f82983000)
        libshm.so => /home/nvidia/.local/lib/python3.6/site-packages/torch/./lib/libshm.so (0x0000007f82969000)
        libnvToolsExt.so.1 => /usr/local/cuda-10.2/lib64/libnvToolsExt.so.1 (0x0000007f82950000)
        libtorch.so => /home/nvidia/.local/lib/python3.6/site-packages/torch/./lib/libtorch.so (0x0000007f58c5f000)
        libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007f58c0d000)
        libc10_cuda.so => /home/nvidia/.local/lib/python3.6/site-packages/torch/./lib/libc10_cuda.so (0x0000007f58bd0000)
        libc10.so => /home/nvidia/.local/lib/python3.6/site-packages/torch/./lib/libc10.so (0x0000007f58b7b000)
        libcudart.so.10.2 => /usr/local/cuda-10.2/lib64/libcudart.so.10.2 (0x0000007f58b07000)
        libmpi_cxx.so.20 => /usr/lib/aarch64-linux-gnu/libmpi_cxx.so.20 (0x0000007f58adc000)
        libmpi.so.20 => /usr/lib/aarch64-linux-gnu/libmpi.so.20 (0x0000007f589eb000)
        libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000007f58858000)
        libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007f58834000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f586db000)
        /lib/ld-linux-aarch64.so.1 (0x0000007f836fe000)
        librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000007f586c4000)
        libcufft.so.10 => /usr/local/cuda-10.2/lib64/libcufft.so.10 (0x0000007f4c631000)
        libcurand.so.10 => /usr/local/cuda-10.2/lib64/libcurand.so.10 (0x0000007f48519000)
        libcublas.so.10 => /usr/lib/aarch64-linux-gnu/libcublas.so.10 (0x0000007f4383b000)
        libcudnn.so.8 => /usr/lib/aarch64-linux-gnu/libcudnn.so.8 (0x0000007f43805000)
        libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007f4374b000)
        libgomp.so.1 => /usr/lib/aarch64-linux-gnu/libgomp.so.1 (0x0000007f4370e000)
        libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000007f436f9000)
        libnuma.so.1 => /usr/lib/aarch64-linux-gnu/libnuma.so.1 (0x0000007f436db000)
        libopenblas.so.0 => /usr/lib/aarch64-linux-gnu/libopenblas.so.0 (0x0000007f43013000)
        libcusparse.so.10 => /usr/local/cuda-10.2/lib64/libcusparse.so.10 (0x0000007f3a960000)
        libopen-rte.so.20 => /usr/lib/aarch64-linux-gnu/libopen-rte.so.20 (0x0000007f3a8ce000)
        libopen-pal.so.20 => /usr/lib/aarch64-linux-gnu/libopen-pal.so.20 (0x0000007f3a81c000)
        libhwloc.so.5 => /usr/lib/aarch64-linux-gnu/libhwloc.so.5 (0x0000007f3a7d8000)
        libcublasLt.so.10 => /usr/lib/aarch64-linux-gnu/libcublasLt.so.10 (0x0000007f38812000)
        libgfortran.so.4 => /usr/lib/aarch64-linux-gnu/libgfortran.so.4 (0x0000007f3870e000)
        libutil.so.1 => /lib/aarch64-linux-gnu/libutil.so.1 (0x0000007f386fb000)
        libltdl.so.7 => /usr/lib/aarch64-linux-gnu/libltdl.so.7 (0x0000007f386e2000)
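
If the ldd listing is long, the unresolved entries can be picked out automatically. The following is a sketch in Python under the assumption that ldd marks unresolved libraries with "=> not found", as it does on Ubuntu; the helper names are hypothetical. It avoids keyword arguments that need Python 3.7+, since the Jetson images here ship Python 3.6.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the library names that ldd reports as 'not found', without repeats."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found" and name not in missing:
            missing.append(name)
    return missing


def run_ldd(path: str) -> str:
    """Run ldd on a shared object and return its standard output as text."""
    result = subprocess.run(
        ["ldd", path], stdout=subprocess.PIPE, universal_newlines=True
    )
    return result.stdout
```

For example, `missing_libraries(run_ldd(path_to_torch_C_so))` returns an empty list when every dependency resolves.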

hyunjaecho1213 · June 23, 2020, 3:33am

Thanks! It worked

mura.ryo03302 · June 24, 2020, 7:42am

The error and the environment it occurs in are as follows:

xavier@xavier-desktop:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_21:14:42_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
xavier@xavier-desktop:~$  python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xavier/.local/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
ImportError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory

I ran the ldd command and found that some links are broken. Please let me know how to solve this problem.

xavier@xavier-desktop:~$ ldd ~/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-aarch64-linux-gnu.so
	linux-vdso.so.1 (0x0000007f943b2000)
	libgtk3-nocsd.so.0 => /usr/lib/aarch64-linux-gnu/libgtk3-nocsd.so.0 (0x0000007f94335000)
	libtorch_python.so => /home/xavier/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so (0x0000007f935cc000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000007f935b7000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007f9358b000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f93432000)
	/lib/ld-linux-aarch64.so.1 (0x0000007f94387000)
	libshm.so => /home/xavier/.local/lib/python3.6/site-packages/torch/lib/libshm.so (0x0000007f93418000)
	libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x0000007f933ff000)
	libtorch.so => /home/xavier/.local/lib/python3.6/site-packages/torch/lib/libtorch.so (0x0000007f6970e000)
	libc10_cuda.so => /home/xavier/.local/lib/python3.6/site-packages/torch/lib/libc10_cuda.so (0x0000007f696d1000)
	libc10.so => /home/xavier/.local/lib/python3.6/site-packages/torch/lib/libc10.so (0x0000007f6967c000)
	libcudart.so.10.2 => /usr/local/cuda/lib64/libcudart.so.10.2 (0x0000007f69608000)
	libmpi_cxx.so.20 => not found
	libmpi.so.20 => not found
	libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000007f69474000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007f69450000)
	librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000007f69439000)
	libcufft.so.10 => /usr/local/cuda/lib64/libcufft.so.10 (0x0000007f5d3a6000)
	libcurand.so.10 => /usr/local/cuda/lib64/libcurand.so.10 (0x0000007f5928e000)
	libcublas.so.10 => /usr/lib/aarch64-linux-gnu/libcublas.so.10 (0x0000007f545b0000)
	libcudnn.so.8 => /usr/lib/aarch64-linux-gnu/libcudnn.so.8 (0x0000007f5457a000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007f544c0000)
	libgomp.so.1 => /usr/lib/aarch64-linux-gnu/libgomp.so.1 (0x0000007f54483000)
	libnuma.so.1 => /usr/lib/aarch64-linux-gnu/libnuma.so.1 (0x0000007f54465000)
	libmpi_cxx.so.20 => not found
	libmpi.so.20 => not found
	libopenblas.so.0 => /usr/lib/aarch64-linux-gnu/libopenblas.so.0 (0x0000007f53d9d000)
	libcusparse.so.10 => /usr/local/cuda/lib64/libcusparse.so.10 (0x0000007f4b6ea000)
	libcublasLt.so.10 => /usr/lib/aarch64-linux-gnu/libcublasLt.so.10 (0x0000007f49724000)
	libgfortran.so.4 => /usr/lib/aarch64-linux-gnu/libgfortran.so.4 (0x0000007f49621000)

Andrey1984 · June 24, 2020, 10:36am

You may try searching for the file:

find ~ -name 'libmpi*'

Otherwise:

sudo apt install mlocate
sudo updatedb
locate libmpi

If the library is not present on the system, it will need to be installed; otherwise the paths will need to be adjusted, for example with something like:

export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$LD_LIBRARY_PATH

reference ImportError: libmpi_cxx.so.1:, the setup of LD_LIBRARY_PATH after install openmpi-bin (Ubuntu18.04) - Solved · Issue #3499 · microsoft/CNTK · GitHub
reference mpi.h: No such file or directory · Issue #32 · NVIDIA/nccl-tests · GitHub
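
The same search can be sketched in Python, which is handy if mlocate is not installed. The directory list below is an assumption about where Ubuntu on aarch64 usually keeps shared libraries, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed default search locations; adjust for your system.
DEFAULT_DIRS = ("/usr/lib/aarch64-linux-gnu", "/usr/lib", "/usr/local/lib")


def find_shared_libraries(
    prefix: str, directories: Iterable[str] = DEFAULT_DIRS
) -> List[str]:
    """Recursively list shared objects whose file name starts with prefix."""
    found = set()
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip locations that do not exist on this machine
        # Matches e.g. libmpi.so, libmpi.so.20, libmpi_cxx.so.20.2.0
        found.update(str(path) for path in root.rglob(prefix + "*.so*"))
    return sorted(found)


if __name__ == "__main__":
    for path in find_shared_libraries("libmpi"):
        print(path)
```

An empty result suggests the library is not installed; a result outside the loader's default directories suggests a path problem instead.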

dusty_nv · June 24, 2020, 4:24pm

Thanks Andrei, yes you should be able to install it with sudo apt-get install libopenmpi-dev

themozel · June 25, 2020, 6:51am

Hey everyone!
Is there a torch wheel for Python 3.7 for use on a Jetson AGX Xavier?

dusty_nv · June 25, 2020, 3:57pm

Hi @themozel, I build the wheels for Python 3.6. However some others on this thread have been able to build PyTorch from source for Python 3.7. I think you need to run sudo apt-get install python3.7-dev first.
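
To see why the Python 3.6 wheel will not install under Python 3.7, it helps to look at the tags in the wheel file name. Below is a rough sketch in Python that assumes the standard name-version-python-abi-platform wheel naming and CPython interpreters; the function names are hypothetical.

```python
import sys
from typing import Tuple


def wheel_python_tag(filename: str) -> str:
    """Return the Python tag (e.g. 'cp36') from a wheel file name."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    # The last three dash-separated fields are the python, abi and platform tags.
    return stem.split("-")[-3]


def wheel_fits_interpreter(
    filename: str, version: Tuple[int, int] = sys.version_info[:2]
) -> bool:
    """True if the wheel's Python tag matches the given CPython major.minor."""
    return wheel_python_tag(filename) == "cp{}{}".format(*version)


if __name__ == "__main__":
    name = "torch-1.4.0-cp36-cp36m-linux_aarch64.whl"
    print(wheel_python_tag(name), wheel_fits_interpreter(name))
```

The cp36 tag means the wheel targets CPython 3.6 only, so Python 3.7 needs a wheel built from source against python3.7-dev.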

mura.ryo03302 · June 25, 2020, 4:53pm

Thanks Andrey1984, dusty_nv.
All resolved.
The problem was that the library was missing, so I solved it by installing libopenmpi-dev.
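
A quick way to confirm the fix before importing torch is to ask the dynamic loader for the libraries directly. This is a sketch using Python's ctypes; the sonames are the ones from the ldd output earlier in the thread, and the helper name is hypothetical.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can find and open the shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Sonames reported as "not found" by ldd earlier in this thread.
    for soname in ("libmpi.so.20", "libmpi_cxx.so.20"):
        print(soname, "OK" if can_load(soname) else "missing")
```

If both lines print OK, the ImportError about libmpi_cxx.so.20 should no longer occur.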
