PyTorch for Jetson - version 1.7.0 now available

I ran into trouble installing torchvision.
I successfully installed PyTorch 1.2, but when I installed torchvision 0.4.0 according to the guide, I got an error saying that c10/cuda/CUDAGuard.h does not exist (no such file or directory).
error: command ‘aarch64-linux-gnu-gcc’ failed with exit status 1

I added a path to include_dirs at line 121 of setup.py:
include_dirs = [extensions_dir, "/home/jetbot/archiconda3/envs/fasterrcnn/lib/python3.6/site-packages/torch/include"]
The “file or directory does not exist” error no longer occurs, but the “aarch64-linux-gnu-gcc failed with exit status 1” error is still there.
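Before hard-coding a path in setup.py, it can help to confirm that the header really exists under the directory being added. The following is a minimal sketch, not part of torchvision: `find_header` and `torch_include_dir` are hypothetical helper names, and the assumption is that PyTorch keeps its headers in an `include` directory next to its `__init__.py`.

```python
import importlib.util
import os


def find_header(include_dirs, header):
    """Return the first directory in include_dirs that contains header, else None."""
    relative = os.path.join(*header.split("/"))
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, relative)):
            return directory
    return None


def torch_include_dir():
    """Guess <site-packages>/torch/include; None if torch cannot be located."""
    spec = importlib.util.find_spec("torch")
    if spec is None or spec.origin is None:
        return None
    # Assumption: the headers live in an "include" folder beside torch/__init__.py.
    return os.path.join(os.path.dirname(spec.origin), "include")


if __name__ == "__main__":
    candidates = [d for d in [torch_include_dir()] if d]
    hit = find_header(candidates, "c10/cuda/CUDAGuard.h")
    if hit:
        print("CUDAGuard.h found under " + hit)
    else:
        print("CUDAGuard.h not found in: " + repr(candidates))
```

If the header is not found in any candidate directory, adding that directory to include_dirs will not make the compile succeed, and the remaining gcc failure needs the full compiler output to diagnose.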

Hi @dusty_nv, I tried compiling 1.7.0 from source but ran into the following error:

In file included from ../aten/src/ATen/cpu/vec256/intrinsics.h:24,
                 from ../aten/src/ATen/cpu/FlushDenormal.cpp:3:
../aten/src/ATen/cpu/vec256/missing_vld1_neon.h:449:1: error: redefinition of ‘void vst1q_p64_x2(poly64_t*, poly64x2x2_t)’
  449 | vst1q_p64_x2 (poly64_t * __a, poly64x2x2_t val)
      | ^~~~~~~~~~~~
In file included from ../aten/src/ATen/cpu/vec256/intrinsics.h:22,
                 from ../aten/src/ATen/cpu/FlushDenormal.cpp:3:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:28217:1: note: ‘void vst1q_p64_x2(poly64_t*, poly64x2x2_t)’ previously defined here
28217 | vst1q_p64_x2 (poly64_t * __a, poly64x2x2_t val)
      | ^~~~~~~~~~~~

I have Python 3.8 because I upgraded to Ubuntu 20.04 with do-release-upgrade, and I can’t get Python 3.6.
I managed to compile 1.6.0 without problems using the current instructions. To make it work, I set /usr/bin/gcc as a symlink to gcc-7, since gcc-9 is not compatible with CUDA.

Do you have any suggestions on how to fix this error?

Hi @787585485, are you using conda? I have not tried it in an Anaconda environment before. Does it work with a normal install?

Hmm, I’m not sure, sorry. I haven’t built for Python 3.8 myself. If you require 1.7.0, you might want to post an issue on the PyTorch GitHub, since from what you describe this issue appeared in 1.7 but 1.6 builds fine.

After hours of compiling and testing, I can confirm that emptying the file did the job:

echo "//must be empty" | tee ./aten/src/ATen/cpu/vec256/missing_vld1_neon.h

PyTorch 1.7.0 now compiles on a Jetson with Ubuntu 20.04, since the VLD intrinsics that this header supplies are not missing on 20.04.
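Before emptying the header, one way to check that the compiler already ships the intrinsic named in the error is to scan its arm_neon.h for the definition. This is only a heuristic sketch: `defines_symbol` and `header_defines` are hypothetical names, the path is the one from the error log above, and the pattern relies on the definition putting the function name at the start of a line, as the log shows.

```python
import os
import re


def defines_symbol(header_text, symbol):
    """Heuristic: True if a line starts with `symbol` followed by '('."""
    pattern = re.compile(r"^\s*" + re.escape(symbol) + r"\s*\(", re.MULTILINE)
    return pattern.search(header_text) is not None


def header_defines(path, symbol):
    """Apply defines_symbol to the contents of the header at path."""
    with open(path, "r", errors="replace") as handle:
        return defines_symbol(handle.read(), symbol)


if __name__ == "__main__":
    # Path taken from the compiler note in the error output; adjust for other GCC versions.
    neon_header = "/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h"
    if os.path.exists(neon_header):
        print(header_defines(neon_header, "vst1q_p64_x2"))
    else:
        print("arm_neon.h not found at " + neon_header)
```

A True result only says the system header already provides that one intrinsic. It does not prove that every function in missing_vld1_neon.h is covered, so a full rebuild is still the real test.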

Hello,

cat /etc/nv_tegra_release
# R32 (release), REVISION: 4.2, GCID: 20074772, BOARD: t210ref, EABI: aarch64, DATE: Thu Apr 9 01:22:12 UTC 2020

It’s revision 4.2, but I can’t install PyTorch 1.5.0.
Even when I explicitly specify the 1.5.0 wheel and install it, version 1.7.0 gets installed instead.
Please confirm.

Is building from source the only way to install version 1.5.0?

Thank you.

You shouldn’t need to build from source to install 1.5.0. Can you run these commands first?

pip3 uninstall torch
sudo pip3 uninstall torch
python3 -c "import torch; print(torch.__version__)"   # this should fail to import
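If the import still succeeds after both uninstall commands, it helps to know which copy Python is picking up. A small sketch, assuming only the standard library; `locate_package` is a hypothetical helper name:

```python
import importlib.util


def locate_package(name):
    """Return where Python would load `name` from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is not None:
        return spec.origin
    # Namespace packages have no single origin; report their search locations instead.
    return str(list(spec.submodule_search_locations))


if __name__ == "__main__":
    print(locate_package("torch"))
```

A path under the home directory would suggest a per-user install, while a path under /usr points at a system-wide one, which is why both the plain and the sudo uninstall are run.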

Then, re-install PyTorch 1.5 wheel:

wget https://nvidia.box.com/shared/static/3ibazbiwtkl181n95n9em3wtrca7tdzp.whl -O torch-1.5.0-cp36-cp36m-linux_aarch64.whl
pip3 install torch-1.5.0-cp36-cp36m-linux_aarch64.whl
python3 -c "import torch; print(torch.__version__)"
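The wheel filename itself says which interpreter it was built for: cp36 means CPython 3.6. A rough sketch of checking that against the running interpreter follows, with the hypothetical helpers `parse_wheel_tags` and `built_for`. It assumes a simple filename with a single Python tag and no build tag, which is the case for the wheel above.

```python
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def built_for(filename, version_info=sys.version_info):
    """True if the wheel's python tag matches CPython major.minor of version_info."""
    python_tag, _, _ = parse_wheel_tags(filename)
    return python_tag == "cp{}{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    wheel = "torch-1.5.0-cp36-cp36m-linux_aarch64.whl"
    print(parse_wheel_tags(wheel))
    print(built_for(wheel))
```

On a system where python3 is 3.8, `built_for` returns False for this wheel, and pip would be expected to refuse it as unsupported on that platform.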

Hello, @dusty_nv

👍

Thank you.

Hello,

sudo pip3 install torch
The directory ‘/home//.cache/pip/http’ or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo’s -H flag.
The directory ‘/home//.cache/pip’ or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo’s -H flag.
Collecting torch
Downloading https://files.pythonhosted.org/packages/f8/02/880b468bd382dc79896eaecbeb8ce95e9c4b99a24902874a2cef0b562cea/torch-0.1.2.post2.tar.gz (128kB)
100% |████████████████████████████████| 133kB 1.1MB/s
Complete output from command python setup.py egg_info:
running egg_info
creating pip-egg-info/torch.egg-info
writing pip-egg-info/torch.egg-info/PKG-INFO
writing dependency_links to pip-egg-info/torch.egg-info/dependency_links.txt
writing requirements to pip-egg-info/torch.egg-info/requires.txt
writing top-level names to pip-egg-info/torch.egg-info/top_level.txt
writing manifest file ‘pip-egg-info/torch.egg-info/SOURCES.txt’
error: package directory ‘torch/cuda’ does not exist

What am I supposed to do?

Thank you.

I’m sorry, that was a typo in my post above - I meant sudo pip3 uninstall torch (not install)


Hi,

When I try to install torchvision and execute the line:
sudo python setup.py install

I get the following error:
File "setup.py", line 6, in <module>
from setuptools import setup, find_packages
ImportError: No module named setuptools

BR
Afshin

Hi @afsamani, which version of PyTorch do you have installed? PyTorch dropped support for Python 2.x as of PyTorch 1.5, so perhaps you meant to run sudo python3 setup.py install?

If you indeed intended to use Python 2.7 and are still getting that error, can you try running sudo pip install setuptools?
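To see which interpreter a given command actually starts and whether it can import setuptools, a short script like the one below can be run with both python and python3. This is a sketch only; `check_build_env` is a hypothetical name, and the code avoids Python 3-only syntax so that it also runs under Python 2.7.

```python
import sys


def check_build_env():
    """Report the interpreter version and the setuptools version (None if missing)."""
    try:
        import setuptools
        setuptools_version = getattr(setuptools, "__version__", "unknown")
    except ImportError:
        setuptools_version = None
    return {"python": tuple(sys.version_info[:3]), "setuptools": setuptools_version}


if __name__ == "__main__":
    info = check_build_env()
    print("python: " + ".".join(str(part) for part in info["python"]))
    print("setuptools: " + str(info["setuptools"]))
```

If the script reports a 2.x interpreter when run as python, then sudo python setup.py install is building against Python 2, whatever python3 has installed.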

Hi,

How can I use matplotlib with l4t-ml? Looks like it is not included.

Thanks!

Hi @Subframe, you can run this command inside the container to install the matplotlib package:

apt-get update && apt-get install python3-matplotlib

Hi @dusty_nv, thanks!

Hi @dusty_nv, I get the following error when I #include <THC/THCAtomics.cuh>. Do you have any advice?
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh: In member function ‘void AtomicAddIntegerImpl<T, 1>::operator()(T*, T)’:
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh:31:13: error: there are no arguments to ‘atomicCAS’ that depend on a template parameter, so a declaration of ‘atomicCAS’ must be available [-fpermissive]
old = atomicCAS(address_as_ui, assumed, newval);
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh: In member function ‘void AtomicAddIntegerImpl<T, 2>::operator()(T*, T)’:
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh:54:13: error: there are no arguments to ‘atomicCAS’ that depend on a template parameter, so a declaration of ‘atomicCAS’ must be available [-fpermissive]
old = atomicCAS(address_as_ui, assumed, newval);
^~~~~~~~~
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh: In member function ‘void AtomicAddIntegerImpl<T, 4>::operator()(T*, T)’:
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh:70:13: error: there are no arguments to ‘atomicCAS’ that depend on a template parameter, so a declaration of ‘atomicCAS’ must be available [-fpermissive]
old = atomicCAS(address_as_ui, assumed, newval);
^~~~~~~~~
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh: In member function ‘void AtomicAddIntegerImpl<T, 8>::operator()(T*, T)’:
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh:86:13: error: there are no arguments to ‘atomicCAS’ that depend on a template parameter, so a declaration of ‘atomicCAS’ must be available [-fpermissive]
old = atomicCAS(address_as_ui, assumed, newval);
^~~~~~~~~
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh: In function ‘void gpuAtomicAdd(int32_t*, int32_t)’:
/usr/local/lib/python3.6/dist-packages/torch/include/THC/THCAtomics.cuh:104:3: error: ‘atomicAdd’ was not declared in this scope
atomicAdd(address, val);

Hi @dmu_hbw, I am unfamiliar with the error - does this happen when you are trying to build PyTorch, or your own application? You might want to post an issue on the PyTorch GitHub about it.