I've got a Jetson Xavier NX running JetPack 4.4 (L4T release 32.4.3), which only works with PyTorch 1.6.0 or newer.
I need PyTorch 1.4.0 because I'm working with Python 2.7 (because of ROS), but I'm currently getting ImportError: libcudart.so.10.0 ... when installing PyTorch v1.4.0, because my Jetson has CUDA 10.2. Should I downgrade my Jetson to JetPack 4.4 Developer Preview or JetPack 4.3 to get rid of the compatibility errors, or how should I proceed to get PyTorch 1.4.0 working?
Hi @jorgemia, if you absolutely need PyTorch 1.4 and Python 2.7, then yes, downgrade to a previous JetPack version. PyTorch versions older than PyTorch 1.5 don't support the production release of cuDNN 8.0.
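As a rough illustration of the mismatch described above, the sketch below pulls the CUDA runtime version that a wheel was built against out of an ImportError message and compares it with the CUDA version installed on the device. The function names and the sample message text after the library name are assumptions made for illustration; the original error was truncated in the post.

```python
import re


def required_cudart_version(import_error_message):
    """Return the libcudart version named in the message, e.g. '10.0', or None."""
    match = re.search(r"libcudart\.so\.(\d+\.\d+)", import_error_message)
    return match.group(1) if match else None


def cuda_mismatch(import_error_message, installed_cuda):
    """True if the wheel asks for a CUDA runtime other than the installed one."""
    required = required_cudart_version(import_error_message)
    return required is not None and required != installed_cuda


# Hypothetical message: only the "libcudart.so.10.0" part comes from the post.
message = "ImportError: libcudart.so.10.0: cannot open shared object file"
print(required_cudart_version(message))   # version the wheel was built for
print(cuda_mismatch(message, "10.2"))     # installed CUDA version from the post
```

A mismatch here means the wheel was built for a different JetPack release than the one on the device, which is why downgrading JetPack (or picking a newer wheel) resolves it.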
An alternative would be to upgrade to a newer version of ROS that uses Python 3 (like ROS Noetic, or ROS2 Eloquent/Foxy). You can find the Dockerfiles for these newer versions of ROS here:
Hi, I am trying to install PyTorch v1.7.0 with torchvision v0.8.1 on a Jetson AGX Xavier.
Installed:
JetPack 4.4.1
CUDA 10.2
Python 3.6.9
when I do
sudo python3 setup.py install
I get the following error:
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/nvidia/torchvision/torchvision/csrc -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/TH -I/home/nvidia/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /home/nvidia/torchvision/torchvision/csrc/cuda/PSROIAlign_cuda.cu -o build/temp.linux-aarch64-3.6/home/nvidia/torchvision/torchvision/csrc/cuda/PSROIAlign_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_72,code=sm_72 -std=c++14
/usr/include/c++/8/utility(307): error: pack expansion does not make use of any argument packs
/usr/include/c++/8/utility(329): error: pack expansion does not make use of any argument packs
/usr/include/c++/8/utility(329): error: expected a ">"
detected during instantiation of type "std::make_integer_sequence<std::size_t, _Num>"
(340): here
/usr/include/c++/8/utility(307): error: identifier "__integer_pack" is undefined
detected during:
instantiation of class "std::_Build_index_tuple<_Num> [with _Num=1UL]"
/usr/include/c++/8/functional(389): here
instantiation of class "std::_Bind<_Functor (_Bound_args...)> [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(393): here
/usr/include/c++/8/utility(307): error: expected a ">"
detected during:
instantiation of class "std::_Build_index_tuple<_Num> [with _Num=1UL]"
/usr/include/c++/8/functional(389): here
instantiation of class "std::_Bind<_Functor (_Bound_args...)> [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(393): here
/usr/include/c++/8/functional(482): error: no instance of function template "std::_Bind<_Functor (_Bound_args...)>::__call [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>]" matches the argument list
argument types are: (std::tuple<>, <error-type>)
object type is: std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>
detected during:
instantiation of "_Result std::_Bind<_Functor (_Bound_args...)>::operator()(_Args &&...) [with _Functor=lambda [](std::function<c10::IValue ()>)->void, _Bound_args=<std::function<c10::IValue ()>>, _Args=<>, _Result=void]"
/usr/include/c++/8/bits/std_function.h(298): here
instantiation of "void std::_Function_handler<void (_ArgTypes...), _Functor>::_M_invoke(const std::_Any_data &, _ArgTypes &&...) [with _Functor=std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>, _ArgTypes=<>]"
/usr/include/c++/8/bits/std_function.h(675): here
instantiation of "std::function<_Res (_ArgTypes...)>::function(_Functor) [with _Res=void, _ArgTypes=<>, _Functor=std::_Bind<lambda [](std::function<c10::IValue ()>)->void (std::function<c10::IValue ()>)>, <unnamed>=void, <unnamed>=void]"
/home/nvidia/.local/lib/python3.6/site-packages/torch/include/ATen/core/ivalue_inl.h(385): here
6 errors detected in the compilation of "/tmp/tmpxft_0000522e_00000000-6_PSROIAlign_cuda.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
Hi @munshiblah, I'm not familiar with this error, but have you tried installing the pre-built PyTorch 1.7 wheel instead of building it from source yourself?
$ git clone --branch v0.8.1 https://github.com/pytorch/vision torchvision # see below for version of torchvision to download
$ cd torchvision
$ export BUILD_VERSION=0.8.0 # where 0.x.0 is the torchvision version
I also tried the command below:
$ export BUILD_VERSION=0.8.1 # where 0.x.0 is the torchvision version
$ sudo python setup.py install # use python3 if installing for Python 3.6
Hello, I ran into the same problem, but I am using PyTorch 1.7 and JetPack 4.4 on a TX2.
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dji/.local/lib/python3.6/site-packages/torch/__init__.py", line 190, in <module>
    from torch._C import *
ImportError: /home/dji/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: sgesdd
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Segmentation fault (core dumped)
(gdb) r -c "import torch"
Starting program: /usr/bin/python3 -c "import torch"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000007f7360544c in GlobalError::PushToStack() () from /usr/lib/aarch64-linux-gnu/libapt-pkg.so.5.0
I uninstalled torch and installed it again, but it still doesn't work.
@757419920 that sgesdd symbol appears to be from the BLAS/LAPACK library. Maybe a different version of that got installed? Other than that, I'm not sure, sorry. Have you tried the l4t-pytorch container, which already has PyTorch pre-installed?
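One way to test the BLAS/LAPACK theory is to ask the dynamic loader whether a given shared library actually exports the symbol. The sketch below uses only the standard-library ctypes module; the helper name is hypothetical, and the library name and the Fortran-style trailing underscore in the commented example are assumptions rather than details confirmed in this thread.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the shared library exports the symbol.

    `library` is a path or soname passed to dlopen; None means the symbols
    already loaded into the current Python process.
    """
    handle = ctypes.CDLL(library)
    # ctypes raises AttributeError for unknown symbols, so hasattr is enough.
    return hasattr(handle, symbol)


# Sanity check against the C library already loaded in this process.
print(exports_symbol(None, "malloc"))

# On the Jetson one might then try (names are assumptions, not run here):
# exports_symbol("libopenblas.so.0", "sgesdd_")
```

If the symbol is present in the system library, the failure is more likely caused by a different copy of the library being picked up first, for example one bundled with a pip package.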
Hi @kausainshah, it is reported that way here too, even though it is in fact torchvision 0.8.1. I'm not sure why BUILD_VERSION is not taking effect. I double-checked and that environment variable is indeed used by torchvision's build script here:
I could use torch until the day before yesterday, but it suddenly started crashing yesterday.
I can still use torch on another TX2 that has the same environment.
When I run "sudo apt-get install libopenblas-base" to install BLAS, it tells me "libopenblas-base is already the newest version (0.2.20+ds-4)."
I'm not physically at the TX2; I connect to it over SSH. Is there another solution?
Hmm, my version of libopenblas-base is also 0.2.20+ds-4. I wonder if some pip package was upgraded that is causing the issue. Since you don't have physical access to the TX2 to re-flash it, I would recommend using the l4t-pytorch container. Start by using one of the container tags which matches your L4T version (you can find this with cat /etc/nv_tegra_release).
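To turn the output of that command into a version string for matching against container tags, something like the following sketch could work. It assumes the first line of /etc/nv_tegra_release has the usual "# R32 (release), REVISION: 4.3, ..." shape; the function name is hypothetical and the field values in the sample line are illustrative, not taken from a real device.

```python
import re


def l4t_version(release_line):
    """Return e.g. '32.4.3' from the first line of /etc/nv_tegra_release."""
    match = re.search(r"R(\d+) \(release\), REVISION: (\d+(?:\.\d+)*)", release_line)
    if match is None:
        raise ValueError("unrecognized nv_tegra_release format")
    return f"{match.group(1)}.{match.group(2)}"


# Illustrative sample line; on a device, read the file instead:
# with open("/etc/nv_tegra_release") as f:
#     line = f.readline()
sample = "# R32 (release), REVISION: 4.3, GCID: 0, BOARD: t186ref, EABI: aarch64"
print("r" + l4t_version(sample))  # container tags are expected to start with this
```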
Thank you for your suggestion. I compared the pip package versions one by one against the other TX2, which works properly, and found that the installed scipy was not compatible with torch. After I uninstalled scipy, "import torch" works normally.
Thank you very much.
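For anyone repeating the package-by-package comparison described above, a small script can do the diff once `pip3 freeze` output has been saved from both machines. This is only a sketch: the function names are hypothetical, and the package versions in the example are made up for illustration.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into a {package_name: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip comments and entries that are not pinned as name==version.
        if line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_packages(broken, working):
    """Return {name: (broken_version, working_version)} for every difference."""
    differences = {}
    for name in sorted(set(broken) | set(working)):
        if broken.get(name) != working.get(name):
            differences[name] = (broken.get(name), working.get(name))
    return differences


# Made-up example data standing in for the two saved `pip3 freeze` outputs.
broken = parse_freeze("torch==1.7.0\nscipy==1.5.4\nnumpy==1.19.4")
working = parse_freeze("torch==1.7.0\nnumpy==1.19.4")
print(diff_packages(broken, working))
```

A value of None on one side means the package is installed on only one of the two machines, which is the kind of difference that pointed to scipy in this case.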