PyTorch for Jetson

I have confirmed that the problem has been resolved in the new wheel. Thank you.

Hi @724773044, please use v0.8.1 as the branch to clone (including the v prefix before the version number)

Hi @zdary - did you end up getting this installed and working with Python 3.7? Any advice here would be super appreciated, thanks!

Hi @SamuelWei - did you successfully build PyTorch with Python 3.7? Any advice or help here would be greatly appreciated! Thank you!

I've got a Jetson Xavier NX running JetPack 4.4 (release version L4T R32.4.3), which only works with PyTorch 1.6.0 or newer.

I need PyTorch 1.4.0 because I'm working with Python 2.7 (because of ROS), but I'm currently getting ImportError: libcudart.so.10.0 ... when installing PyTorch v1.4.0, because my Jetson has CUDA 10.2. Should I downgrade my Jetson to JetPack 4.4 Developer Preview / JetPack 4.3 to get rid of the compatibility errors, or how else should I proceed to get PyTorch 1.4.0 working?

Hi @jorgemia, if you absolutely need PyTorch 1.4 and Python 2.7, then yes, downgrade to a previous JetPack version. PyTorch versions older than 1.5 don't support the production release of cuDNN 8.0.
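(As a quick way to check this kind of mismatch, the snippet below, a minimal sketch not from the original reply, prints the CUDA and cuDNN versions a given PyTorch wheel was built against, so they can be compared to what JetPack installed.)

import torch

# Which PyTorch wheel is installed
print("PyTorch:", torch.__version__)

# CUDA toolkit version the wheel was built against (e.g. "10.0" vs. "10.2")
print("CUDA (build):", torch.version.cuda)

# cuDNN version PyTorch was linked against, e.g. 8000 for cuDNN 8.0.0
print("cuDNN:", torch.backends.cudnn.version())

# Whether the GPU is actually usable at runtime
print("CUDA available:", torch.cuda.is_available())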

An alternative would be to upgrade to a newer version of ROS that uses Python 3 (like ROS Noetic, or ROS2 Eloquent/Foxy). You can find the Dockerfiles for these newer versions of ROS here:


Hello, I use a Jetson TX2. I have installed the new JetPack 4.4.1 and am trying to install PyTorch v1.7.0.

But I get an error when I run the last build step, $ python3 setup.py bdist_wheel, after the build reaches 49%.

Does PyTorch v1.7.0 not need the patch applied for this?

Thank you very much.

Hi @anom.tmu, yes, to build PyTorch 1.7.0 from source, the patch (pytorch-1.7-jetpack-4.4.1.patch) needs to be applied.

However, if you just install the pre-built wheel, you don't need to build it from source.

JetPack 4.4 (L4T R32.4.3) / JetPack 4.4.1 (L4T R32.4.4)
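(After installing one of those pre-built wheels, a quick smoke test like the following, just an illustrative sketch, confirms the wheel imports and can reach the GPU.)

import torch

print(torch.__version__)           # should report 1.7.0 for the JetPack 4.4/4.4.1 wheel
print(torch.cuda.is_available())   # should be True on a correctly installed JetPack

# Run a tiny tensor op on the GPU to make sure CUDA kernels actually execute
x = torch.rand(2, 3).cuda()
print((x @ x.t()).cpu())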

Hi, I am trying to install PyTorch v1.7.0 with torchvision v0.8.1 on a Jetson AGX Xavier.
Installed:

  1. JetPack 4.4.1
  2. CUDA 10.2
  3. Python 3.6.9

When I run

sudo python3 setup.py install

I get the following error:

/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/nvidia/torchvision/torchvision/csrc -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/TH -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /home/nvidia/torchvision/torchvision/csrc/cuda/PSROIAlign_cuda.cu -o build/temp.linux-aarch64-3.6/home/nvidia/torchvision/torchvision/csrc/cuda/PSROIAlign_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_72,code=sm_72 -std=c++14
/usr/include/c++/8/utility(307): error: pack expansion does not make use of any argument packs

/usr/include/c++/8/utility(329): error: pack expansion does not make use of any argument packs

/usr/include/c++/8/utility(329): error: expected a ">"
          detected during instantiation of type "std::make_integer_sequence<std::size_t, _Num>"
(340): here

/usr/include/c++/8/utility(307): error: identifier "__integer_pack" is undefined
          detected during:
            instantiation of class "std::_Build_index_tuple<_Num> [with _Num=1UL]"
/usr/include/c++/8/functional(389): here
            instantiation of class "std::_Bind<_Functor (_Bound_args...)> [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(393): here

/usr/include/c++/8/utility(307): error: expected a ">"
          detected during:
            instantiation of class "std::_Build_index_tuple<_Num> [with _Num=1UL]"
/usr/include/c++/8/functional(389): here
            instantiation of class "std::_Bind<_Functor (_Bound_args...)> [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(393): here

/usr/include/c++/8/functional(482): error: no instance of function template "std::_Bind<_Functor (_Bound_args...)>::__call [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]" matches the argument list
            argument types are: (std::tuple<>, <error-type>)
            object type is: std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>
          detected during:
            instantiation of "_Result std::_Bind<_Functor (_Bound_args...)>::operator()(_Args &&...) [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>, _Args=<>, _Result=void]"
/usr/include/c++/8/bits/std_function.h(298): here
            instantiation of "void std::_Function_handler<void (_ArgTypes...), _Functor>::_M_invoke(const std::_Any_data &, _ArgTypes &&...) [with _Functor=std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>, _ArgTypes=<>]"
/usr/include/c++/8/bits/std_function.h(675): here
            instantiation of "std::function<_Res (_ArgTypes...)>::function(_Functor) [with _Res=void, _ArgTypes=<>, _Functor=std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>, <unnamed>=void, <unnamed>=void]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(385): here

6 errors detected in the compilation of "/tmp/tmpxft_0000522e_00000000-6_PSROIAlign_cuda.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

Can anyone please help?
Thank you.

Hi @munshiblah, I'm not familiar with this error, but have you tried installing the pre-built PyTorch 1.7 wheel without needing to build it from source yourself?

JetPack 4.4 (L4T R32.4.3) / JetPack 4.4.1 (L4T R32.4.4)

Regarding the compilation error, are you sure you cloned the PyTorch repo with --branch v1.7.0?

Yes @dusty_nv, I have installed torch using the wheel itself. I can install and import torch easily, but I am unable to install torchvision.

Hmm, I just tried again and am able to build it here - are you sure you cloned torchvision with --branch v0.8.1?

If you are still having problems, you can use the l4t-pytorch container which includes torchvision pre-built. The dockerfiles for them are found here: https://github.com/dusty-nv/jetson-containers


$ git clone --branch v0.8.1 https://github.com/pytorch/vision torchvision # see below for version of torchvision to download
$ cd torchvision
$ export BUILD_VERSION=0.8.0 # where 0.x.0 is the torchvision version
** I also tried the command below
$ export BUILD_VERSION=0.8.1
$ sudo python setup.py install # use python3 if installing for Python 3.6

It always builds torchvision 0.8.0:

>>> torchvision.__version__
'0.8.0a0+45f960c'

The required torchvision version is 0.8.1.

Hello, I hit the same problem, but I am using PyTorch 1.7 and JetPack 4.4 on a TX2.

Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import cv2
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dji/.local/lib/python3.6/site-packages/torch/__init__.py", line 190, in <module>
    from torch._C import *
ImportError: /home/dji/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: sgesdd

Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
Segmentation fault (core dumped)

(gdb) r -c "import torch"
Starting program: /usr/bin/python3 -c "import torch"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x0000007f7360544c in GlobalError::PushToStack() () from /usr/lib/aarch64-linux-gnu/libapt-pkg.so.5.0

I uninstalled torch and installed it again, but it doesn't work.

@757419920 that sgesdd symbol appears to come from the BLAS/LAPACK library, so maybe a different version of that got installed? Other than that, I'm not sure, sorry. Have you tried the l4t-pytorch container, which already has it pre-installed?
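(One way to see whether that LAPACK routine is actually exported by the OpenBLAS library on the board, a hedged sketch: the shared-library name below is an assumption and may differ on your image.)

import ctypes

# Assumed library name; check with `ldconfig -p | grep openblas` if yours differs.
blas = ctypes.CDLL("libopenblas.so.0")

# Fortran LAPACK symbols are usually exported with a trailing underscore,
# so try both spellings of the routine named in the ImportError.
for name in ("sgesdd", "sgesdd_"):
    try:
        getattr(blas, name)   # attribute lookup triggers a dlsym() for the symbol
        print(name, "is exported by libopenblas")
    except AttributeError:
        print(name, "is NOT exported")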

Hi @kausainshah, it is reported that way here too, even though it is in fact torchvision 0.8.1. I'm not sure why BUILD_VERSION is not taking effect. I double-checked, and that environment variable is indeed used by torchvision's build script here:

https://github.com/pytorch/vision/blob/ca6fdd6d5de3643d79667bac5dd4e2c6ecaf21bc/setup.py#L45

If you find out the reason, please let us know.
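(For reference, the version logic in torchvision's setup.py around that line is roughly the following, a paraphrased sketch rather than a verbatim copy: the build starts from version.txt and only overrides it when BUILD_VERSION is visible in the environment of the process running setup.py.)

import os

# Paraphrase of torchvision's setup.py version handling (not the exact upstream code)
version = open("version.txt").read().strip()   # e.g. "0.8.0a0"
if os.getenv("BUILD_VERSION"):
    version = os.getenv("BUILD_VERSION")       # e.g. "0.8.1"
print(version)

(One possible explanation for the 0.8.0a0 result above, offered only as a guess: sudo resets the environment by default, so an exported BUILD_VERSION is not visible to sudo python3 setup.py install. Running sudo -E python3 setup.py install, or putting the variable on the sudo command line itself, preserves it.)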

I could use torch until yesterday, but it suddenly started crashing yesterday.
I can use torch on another TX2 which has the same environment.
I used "sudo apt-get install libopenblas-base" to install BLAS, and it tells me "libopenblas-base is already the newest version (0.2.20+ds-4)."
I'm not physically at the TX2; I connect to it over SSH. So is there another solution?

@dusty_nv Surely

Hmm, my version of libopenblas-base is also 0.2.20+ds-4. I wonder if some pip package was upgraded that is causing the issue. Since you don't have physical access to the TX2 to re-flash it, I would recommend using the l4t-pytorch container. Start by using one of the container tags which matches your L4T version (you can find this with cat /etc/nv_tegra_release).

Then if that works, and if you still want PyTorch 1.7, you can rebuild the PyTorch 1.7 container against your current JetPack-L4T version using this repo: https://github.com/dusty-nv/jetson-containers
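(If it is easier over SSH, the same L4T release string can also be read from Python, a small sketch assuming the stock JetPack path; the first line maps onto the container tag naming, e.g. "R32 ... REVISION: 4.4" corresponds to an r32.4.4-prefixed tag.)

# Print the L4T release the same way `cat /etc/nv_tegra_release` would,
# e.g. "# R32 (release), REVISION: 4.4, ..."
with open("/etc/nv_tegra_release") as f:
    print(f.readline().strip())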

Thank you for your suggestion. I compared the pip package versions one by one against the other TX2 that works properly, and found that scipy was not compatible with torch. After I removed scipy, 'import torch' works normally.
Thank you very much.
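(For anyone else debugging a similar crash by diffing two boards, a short script like this, purely illustrative and roughly equivalent to pip3 freeze, prints every installed package and version so the outputs from a working and a broken machine can be compared side by side.)

import pkg_resources

# List installed packages and versions; run on both boards and diff the output.
for dist in sorted(pkg_resources.working_set, key=lambda d: d.project_name.lower()):
    print("{}=={}".format(dist.project_name, dist.version))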