PyTorch for Jetson

Hi, I’m new to Linux and AI and just bought a Nano to learn a little.
I’m trying to use a pretrained model, torchvision.models.detection.maskrcnn_resnet50_fpn, which requires torchvision >= 0.3, so I installed version 0.5 using the commands from the first page (code below). But when I import torchvision, the version is still 0.2.2. How can I use the version I just installed? THX

$ sudo apt-get install libjpeg-dev zlib1g-dev
$ git clone --branch <version> https://github.com/pytorch/vision torchvision   # see below for version of torchvision to download
$ cd torchvision
$ sudo python setup.py install
$ cd ../  # attempting to load torchvision from build dir will result in import error

Was torchvision installed previously somehow? You might want to remove the torchvision package from your system with pip, delete the git repo you cloned, and try again. What was the ‘git clone’ command that you used?
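For reference, a quick way to check which torchvision copy Python is actually importing, and to clear out an old pip-installed 0.2.2 before rebuilding (a rough sketch; adjust pip vs pip3 and paths to your setup):

$ python3 -c "import torchvision; print(torchvision.__version__, torchvision.__file__)"   # shows which copy gets imported
$ pip3 uninstall torchvision          # removes an old pip-installed copy, if present
$ sudo pip3 uninstall torchvision     # also check the system-wide site-packages
$ rm -rf torchvision                  # delete the old clone, then re-clone and rebuild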

Any plans to build for JetPack 4.3?

Hi rdejana, from my testing the same pip wheel builds work for JetPack 4.3 as well.

Thanks Dusty!

Hi, I have been using a TX1 for a little while, and I have installed PyTorch; I can import it in python3. Recently I installed torchvision, and it seems like it installed successfully, but when I import it I get a RuntimeError, shown below.

nvidia@nvidia-desktop:~$ python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch as t
>>> import torchvision as tv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/__init__.py", line 3, in <module>
    from torchvision import models
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/__init__.py", line 12, in <module>
    from . import detection
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/__init__.py", line 1, in <module>
    from .faster_rcnn import *
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/faster_rcnn.py", line 13, in <module>
    from .rpn import AnchorGenerator, RPNHead, RegionProposalNetwork
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/rpn.py", line 11, in <module>
    from . import _utils as det_utils
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/_utils.py", line 19, in <module>
    class BalancedPositiveNegativeSampler(object):
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 1219, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 1076, in _compile_and_register_class
    _jit_script_class_compile(qualified_name, ast, rcb)
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/_recursive.py", line 222, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 1226, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
RuntimeError: 
builtin cannot be used as a value:
at /usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/_utils.py:14:56
def zeros_like(tensor, dtype):
    # type: (Tensor, int) -> Tensor
    return torch.zeros_like(tensor, dtype=dtype, layout=tensor.layout,
                                                        ~~~~~~~~~~~~~ <--- HERE
                            device=tensor.device, pin_memory=tensor.is_pinned())
'zeros_like' is being compiled since it was called from '__torch__.torchvision.models.detection._utils.BalancedPositiveNegativeSampler.__call__'
at /usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+d2c763e-py3.6-linux-aarch64.egg/torchvision/models/detection/_utils.py:72:12

            # randomly select positive and negative examples
            perm1 = torch.randperm(positive.numel(), device=positive.device)[:num_pos]
            perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg]

            pos_idx_per_image = positive[perm1]
            neg_idx_per_image = negative[perm2]

            # create binary mask from indices
            pos_idx_per_image_mask = zeros_like(
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...  <--- HERE
                matched_idxs_per_image, dtype=torch.uint8
            )
            neg_idx_per_image_mask = zeros_like(
                matched_idxs_per_image, dtype=torch.uint8
            )

            pos_idx_per_image_mask[pos_idx_per_image] = torch.tensor(1, dtype=torch.uint8)
            neg_idx_per_image_mask[neg_idx_per_image] = torch.tensor(1, dtype=torch.uint8)

How can I fix it? THX!

Hi all,

I am trying to install PyTorch v1.3.0 on the Jetson Nano. I was able to run the wget command. However, when I run the second command:

pip3 install numpy torch-1.3.0-cp36-cp36m-linux_aarch64.whl

I am facing the below error.

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-80g4vkm7/numpy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-sqm1_slh-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-80g4vkm7/numpy/

Can anyone please help me on this?

Thanks in advance.
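In case it helps anyone hitting the same thing: that generic failure usually means numpy failed to build, rather than anything being wrong with the torch wheel itself. A hedged first check (package names assumed for Ubuntu/JetPack) would be something like:

$ sudo apt-get install python3-pip python3-dev    # headers numpy needs to compile
$ pip3 install -U setuptools wheel
$ pip3 install numpy                              # install numpy on its own to isolate the error
$ pip3 install torch-1.3.0-cp36-cp36m-linux_aarch64.whl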

I’m getting this error when installing torchvision from source. Should libnvToolsExt.so.1 be there? I don’t see it.

appuser@afd0747ef1dc:~/vision$ python3 setup.py install
Traceback (most recent call last):
  File "setup.py", line 14, in <module>
    import torch
  File "/home/appuser/.local/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
ImportError: libnvToolsExt.so.1: cannot open shared object file: No such file or directory
appuser@afd0747ef1dc:~/vision$ sudo python3 setup.py install
Traceback (most recent call last):
  File "setup.py", line 14, in <module>
    import torch
  File "/home/appuser/.local/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
ImportError: libnvToolsExt.so.1: cannot open shared object file: No such file or directory

libnvToolsExt.so should have been installed by JetPack as part of the CUDA toolkit - for more info, please see this post:

https://devtalk.nvidia.com/default/topic/1067246/jetson-tx2/how-can-i-install-the-pytorch-/post/5405773/#5405773
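A quick way to confirm the library is actually on the device and visible at runtime (paths assumed for a default JetPack CUDA install):

$ ls /usr/local/cuda/lib64/libnvToolsExt*                          # should list the .so
$ echo $LD_LIBRARY_PATH                                            # the CUDA lib dir should be on here
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH    # add it for the current shell if missing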

I’m getting this same error. Any ideas how to fix this?

I’m getting the same error when attempting to import torchvision as well. It looks like it’s a known issue and they are fixing it in PyTorch v1.4 – which doesn’t help us much.

https://github.com/pytorch/vision/issues/1675

Update:

In order to get torchvision to import with PyTorch v1.3, I ended up installing the following:

torchvision 0.4.2
pillow 6.2.2
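For anyone following along, that combination would be installed roughly like this (a sketch based on the build steps earlier in the thread):

$ pip3 install 'pillow<7'       # 6.2.x; Pillow 7 removed PILLOW_VERSION, which breaks older torchvision
$ git clone --branch v0.4.2 https://github.com/pytorch/vision torchvision
$ cd torchvision
$ sudo python3 setup.py install
$ cd ../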

Thanks lycaass - I’ve updated the instructions in the main post to reflect this torchvision version to use for the time being.

Thanks for the answer. I wasn’t using --runtime nvidia when running the container, so I kept having the issue. Now with the runtime added, I see libnvToolsExt.so there. It looks like I can’t install torchvision during Docker build time.

Yes, I did the same thing to get it working. I just used Pillow 6.1; will try with 6.2.2.
pip3 install matplotlib Pillow==6.1
git clone -b v0.4.2 https://github.com/pytorch/vision

I had an error with torchvision.

I ran the commands below.
$ sudo apt-get install libjpeg-dev zlib1g-dev
$ git clone --branch v0.4.0 https://github.com/pytorch/vision torchvision # see below for version of torchvision to download
$ cd torchvision
$ sudo python setup.py install

Then I get the error below.

Traceback (most recent call last):
  File "setup.py", line 6, in <module>
    from setuptools import setup, find_packages
ImportError: No module named setuptools

Please help me.

Jetson Nano
JetPack 4.2.2
Python 3.6.9
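In case it helps: on JetPack, `sudo python setup.py install` likely invokes Python 2, which may not have setuptools installed. A hedged workaround is to install setuptools for the interpreter you actually want and build with python3:

$ sudo apt-get install python3-setuptools    # or: sudo pip3 install -U setuptools
$ cd torchvision
$ sudo python3 setup.py install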

I think I’m on to something, but I don’t know how to solve it.

When building a Docker image on the Nano (based on nvcr.io/nvidia/l4t-base:r32.3.1), I also get the famous missing libnvToolsExt.so.1 error when the build step gets to installing torchvision (python3 setup.py install). The PATH and LD_LIBRARY_PATH ENVs are set in the Dockerfile.

When I then run the last container left over from the crashed build and execute ‘python3 setup.py install’ from within that container, torchvision gets built and I can import and use it in Python.

I’m thinking the temporary container during the build process is not started with ‘--gpus all’, whereas I do use that option when I run the container myself, so the temporary build container does not have access to the NVIDIA CUDA libs while the manually started container does.

The question now is how I can get the intermediate container during the build process to also use the ‘--gpus all’ option.
Any ideas?

Found the solution for the missing libnvToolsExt.so.1 here:

https://github.com/NVIDIA/nvidia-docker/issues/1033

Don’t forget to restart the Docker daemon after changing the daemon.json file (or reboot the whole device).
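For reference, the fix from that issue boils down to making nvidia the default Docker runtime, so the intermediate build containers also get the CUDA libraries; a sketch of /etc/docker/daemon.json (contents assumed from that issue):

$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
$ sudo systemctl restart docker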

Hi, thanks for the wheel !

When I run PyTorch on the CPU, it only uses one core, i.e. I get 100% CPU usage instead of 400%, when I run something like:

import torch
a = torch.rand(100, 1000, 1000)
b = torch.rand(100, 1000, 1000)

while True:
    c = torch.bmm(a, b)

I also noticed the same behaviour with numpy.
TensorFlow uses all four cores, though.

Any thoughts how to fix that?

Thanks in advance,
Manu
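As a diagnostic (not a guaranteed fix), it may help to check how many threads PyTorch thinks it can use and how it was built to parallelize; the thread count can also be capped by OMP/OpenBLAS environment variables:

$ python3 -c "import torch; print(torch.get_num_threads())"
$ python3 -c "import torch; print(torch.__config__.parallel_info())"
$ OMP_NUM_THREADS=4 python3 bmm_test.py    # bmm_test.py is a hypothetical script containing the loop above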

I have a problem installing torch on the Jetson Xavier AGX.
I have followed these steps:

  1. wget https://nvidia.box.com/shared/static/phqe92v26cbhqjohwtvxorrwnmrnfx1o.whl -O torch-1.3.0-cp36-cp36m-linux_aarch64.whl

  2. pip3 install numpy torch-1.3.0-cp36-cp36m-linux_aarch64.whl

  3. sudo apt-get install libjpeg-dev zlib1g-dev

  4. git clone --branch v0.4.2 https://github.com/pytorch/vision torchvision

  5. cd torchvision

  6. sudo python setup.py install

  7. cd ../

  8. cd pytorch

  9. git clone --recursive --branch v1.3.0 https://github.com/pytorch/pytorch

  10. export USE_NCCL=0

  11. export USE_DISTRIBUTED=0

  12. export TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2"

  13. export PYTORCH_BUILD_VERSION=1.3.0

  14. export PYTORCH_BUILD_NUMBER=1

  15. sudo apt-get install python3-pip cmake

  16. sudo pip3 install -U setuptools

  17. sudo pip3 install -r requirements.txt

  18. pip3 install scikit-build --user

  19. pip3 install ninja --user

  20. python3 setup.py bdist_wheel

After hours of compilation, I wanted to test some functions of torch, so I created 2 random tensors (A and B) and then used ‘X, LU = torch.solve(A, B)’, but I receive an error about missing LAPACK:

‘RuntimeError: solve: LAPACK library not found in compilation’

What am I doing wrong?

What happened after you installed torch-1.3.0-cp36-cp36m-linux_aarch64.whl with pip3? Did that run ok?

From this related issue, it would seem that you need to install the OpenBLAS package or similar: https://github.com/torch/torch7/issues/174
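Concretely, on Ubuntu/JetPack that would look something like this (package names assumed; PyTorch’s build should then find LAPACK and enable torch.solve):

$ sudo apt-get install libopenblas-dev libopenblas-base   # provides BLAS/LAPACK for the PyTorch build
$ cd pytorch
$ python3 setup.py bdist_wheel                            # rebuild so the LAPACK check passes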