PyTorch for Jetson

No worries @romangirin, glad you got it working!

How did you solve this problem, please?

I have a small problem getting torch and torchvision to like each other when experimenting with YOLOv5 from Ultralytics.

I have a Jetson Nano, and it seems I have JetPack 4.4 installed:

wk@jn227:~$ dpkg-query --show nvidia-l4t-core
nvidia-l4t-core 32.4.4-20201016124427

Installed successfully:

  • torch-1.9.0-cp36-cp36m-linux_aarch64.whl
  • torchvision 0.10.0
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>> torchvision.__version__
'0.10.0'
>>> import torch
>>> torch.__version__
'1.9.0'
>>> torch.cuda.is_available()
True
>>>

But when I try to run the YOLOv5 demo, I get an error message saying they are not compatible, even though they should be if I understand correctly. One thing I noticed worries me: it seems to think I have a Tegra X1, but I just have a Nano :(

YOLOv5 🚀 2021-7-4 torch 1.9.0 CUDA:0 (NVIDIA Tegra X1, 3964.12109375MB)
wk@jn227:~/yolov5$ python3 demo.py
Using cache found in /home/wk/.cache/torch/hub/ultralytics_yolov5_master
Fusing layers...
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /media/nvidia/NVME/pytorch/pytorch-v1.9.0/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Model Summary: 224 layers, 7266973 parameters, 0 gradients
Adding AutoShape...
YOLOv5 🚀 2021-7-4 torch 1.9.0 CUDA:0 (NVIDIA Tegra X1, 3964.12109375MB)

Traceback (most recent call last):
  File "demo.py", line 10, in <module>
    results = model(img)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/wk/.cache/torch/hub/ultralytics_yolov5_master/models/common.py", line 281, in forward
    y = non_max_suppression(y, self.conf, iou_thres=self.iou, classes=self.classes, max_det=self.max_det)  # NMS
  File "/home/wk/.cache/torch/hub/ultralytics_yolov5_master/utils/general.py", line 547, in non_max_suppression
    i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
  File "/usr/local/lib/python3.6/dist-packages/torchvision/ops/boxes.py", line 34, in nms
    _assert_has_ops()
  File "/usr/local/lib/python3.6/dist-packages/torchvision/extension.py", line 63, in _assert_has_ops
    "Couldn't load custom C++ ops. This can happen if your PyTorch and "
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
wk@jn227:~/yolov5$

Hi there,

I never solved it, haha. I just retrained all my models in TensorFlow and rewrote my codebase. It works just as well without the RAM overhead (it sure is uglier than PyTorch, but oh well).

Not to worry - the Nano is based on the TX1 chip (Tegra X1), so it is totally normal for it to report that.

OK, I just did a quick test trying to run torchvision.ops.nms() with PyTorch 1.9 / torchvision 0.10, and did not have any problems running it. I’m not sure if the issue is specific to your compilation of torchvision, or if it has something to do with the Ultralytics code running it.
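
Something along these lines reproduces the check, in case you want to try it on your build (a minimal sketch, not my exact test):

import torch
import torchvision

# random boxes in (x1, y1, x2, y2) form; ensure x2 > x1 and y2 > y1
boxes = torch.rand(10, 4).cuda()
boxes[:, 2:] += boxes[:, :2]
scores = torch.rand(10).cuda()

# if the custom C++ ops loaded correctly, this prints the kept indices
# instead of raising the "Couldn't load custom C++ ops" RuntimeError
print(torchvision.ops.nms(boxes, scores, 0.5))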

Has the Ultralytics code been tested against PyTorch 1.9 / torchvision 0.10? Maybe it expects an older version?

This error pops up: ERROR: torch-1.8.0-cp36-cp36m-linux_aarch64.whl is not a supported wheel on this platform.

How can I fix it?

Hi @jmangasm, can you confirm you are trying to install it with pip3? (it is a Python 3.6 wheel)

What version of JetPack-L4T are you running? The torch-1.8 wheel should run on JetPack 4.4 through the latest JetPack 4.5.1.
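
You can check with either of these (both standard on L4T installs):

cat /etc/nv_tegra_release
dpkg-query --show nvidia-l4t-core

For example, on R32 a REVISION of 4.4 corresponds to L4T 32.4.4 (JetPack 4.4.1), and 5.1 to L4T 32.5.1 (JetPack 4.5.1).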

edit - also refer to your other post here

I have installed successfully. Thanks.

Hi all, the PyTorch v1.9 wheel has been updated due to an issue found in PyTorch (#61110).

Here is the updated wheel:

https://nvidia.box.com/shared/static/h1z9sw4bb1ybi0rm3tu8qdj8hs05ljbm.whl
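
For reference, installing it follows the same pattern as the other wheels in this thread (the local filename below is just an example):

wget https://nvidia.box.com/shared/static/h1z9sw4bb1ybi0rm3tu8qdj8hs05ljbm.whl -O torch-1.9.0-cp36-cp36m-linux_aarch64.whl
pip3 install torch-1.9.0-cp36-cp36m-linux_aarch64.whl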

Hi, I compiled it the same way you did, but I still have problems.
Here is my code:

import torch

# subnet is an existing torch.nn.Module being prepared for quantization
subnet.qconfig = torch.quantization.default_qconfig
print(subnet.qconfig)
print(torch.backends.quantized.supported_engines)

# switch to the QNNPACK backend (this overrides the default qconfig above)
subnet.qconfig = torch.quantization.get_default_qconfig('qnnpack')
torch.backends.quantized.engine = 'qnnpack'
torch.quantization.prepare(subnet, inplace=True)

This is the error:

QConfig(activation=functools.partial(<class 'torch.quantization.observer.MinMaxObserver'>, reduce_range=True), weight=functools.partial(<class 'torch.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric))
['none']
terminate called after throwing an instance of 'c10::Error'
  what():  quantized engine QNNPACK is not supported
Exception raised from setQEngine at /media/nvidia/WD_NVME/PyTorch/JetPack_4.4.1/pytorch-v1.8.0/aten/src/ATen/Context.cpp:184 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xa0 (0x7f572cc290 in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb4 (0x7f572c90fc in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: at::Context::setQEngine(c10::QEngine) + 0x164 (0x7f6cfaec1c in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #3: THPModule_setQEngine(_object*, _object*) + 0x94 (0x7f71364a4c in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #5: python3() [0x52ba70]
frame #7: python3() [0x529978]
frame #9: python3() [0x5f4d34]
frame #11: python3() [0x5a7228]
frame #12: python3() [0x582308]
frame #16: python3() [0x529978]
frame #17: python3() [0x52b8f4]
frame #19: python3() [0x52b108]
frame #24: __libc_start_main + 0xe0 (0x7f8a127720 in /lib/aarch64-linux-gnu/libc.so.6)
frame #25: python3() [0x420e94]

Aborted (core dumped)

My platform is a Jetson Xavier NX, and the PyTorch version is 1.8.0.
How can I fix it?

@gyt.971027
Would you consider using:

  1. a newer JetPack version? (4.4.1 → 4.5.1)
  2. a newer PyTorch version? (1.8.0 → 1.9.0)

Either of the two? Both? Neither? In the meantime, see the sketch below for a way to avoid the hard crash.
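
A guard like this checks the available engines before selecting one, so the program does not abort; it is a sketch (subnet is the module from your snippet), and note that ['none'] suggests this wheel was built without any quantized backend at all:

import torch

engines = torch.backends.quantized.supported_engines
print(engines)  # ['none'] in your log: no quantized backend was compiled in

if 'qnnpack' in engines:
    # safe to select QNNPACK and prepare the model
    torch.backends.quantized.engine = 'qnnpack'
    torch.quantization.prepare(subnet, inplace=True)
else:
    print('QNNPACK is not available in this build; skipping quantization')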

Sorry, I am not sure whether it is my JetPack version:

# R32 (release), REVISION: 5.1, GCID: 27362550, BOARD: t186ref, EABI: aarch64, DATE: Wed May 19 18:16:00 UTC 2021

It looks like my JetPack version is 5.1?
And I have not tried a newer PyTorch version. If I use PyTorch 1.9.0, which version of torchvision should I use?

@gyt.971027
You would use torchvision 0.10.0, as per the reference table from the GitHub pytorch/vision repository:

torch             torchvision       python
master / nightly  master / nightly  >=3.6
1.9.0             0.10.0            >=3.6
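
For reference, on Jetson the matching torchvision is typically built from source against that tag; a sketch following the pattern of the build instructions in this thread (see those instructions for the full dependency list):

sudo apt-get install libjpeg-dev zlib1g-dev
git clone --branch v0.10.0 https://github.com/pytorch/vision torchvision
cd torchvision
export BUILD_VERSION=0.10.0
python3 setup.py install --user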

Thanks very much, @Andrey1984.
May I ask what the 'Apply Patch' step means? What should I select after the 'Download PyTorch sources' step? I know my question may be stupid, but I do not know about it.
Thanks very much again!
Kind regards, Andrey.

Is this the URL you used to download the PyTorch 1.9 wheel?

https://nvidia.box.com/shared/static/h1z9sw4bb1ybi0rm3tu8qdj8hs05ljbm.whl

Hi @dusty_nv ,

Recently, I rebuilt the PyTorch package with the MAGMA library in the l4t-ml:r32.4.4-py3 docker container. The PyTorch package was built and installed successfully, but I am still facing some issues when using it.

These are the package versions after the rebuild:
PyTorch Version: 1.9.0
MAGMA Version: 2.5.3

ISSUE 1:

[screenshot: timing of the first tensor transfer from CPU to GPU with the rebuilt PyTorch]

As the screenshot above showed, it takes more than 10 minutes to move a tensor from CPU to GPU (first initialization) with the rebuilt PyTorch.

[screenshot: the same transfer with the original PyTorch]

By comparison, the original PyTorch in l4t-ml:r32.4.4-py3 only takes 5.43 seconds. How can I solve this ultra-long first initialization from CPU to GPU with the rebuilt PyTorch?

ISSUE 2:

[screenshot: error output from torch.solve]

When I use the torch.solve function, it pops up an error as shown above, but it still returns an output. Does the error mean anything in this case?

Dockerfile used to rebuild PyTorch with the MAGMA library from the original l4t-ml:r32.4.4-py3:

ARG BASE_IMAGE=nvcr.io/nvidia/l4t-ml:r32.4.4-py3
FROM ${BASE_IMAGE}

WORKDIR /root

RUN apt-get update
RUN pip3 install --upgrade pip
RUN pip3 install scikit-build cmake

# clone the build scripts and run the PyTorch + MAGMA rebuild step
RUN git clone https://github.com/matthewmax09/test_build && \
    cd test_build && \
    chmod +x step3_install_fastai2.sh && \
    ./step3_install_fastai2.sh

Please also let me know if you have any questions. Thank you!

Regards,
@ziimiin
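
A couple of guesses on the issues above, offered as assumptions rather than confirmed diagnoses:

On ISSUE 1: if the rebuild did not set TORCH_CUDA_ARCH_LIST for the Jetson GPU architectures, the first CUDA call can spend many minutes JIT-compiling kernels from PTX, which would match the 10+ minute first initialization. The from-source builds discussed in this thread export it before building:

export TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2"   # Nano/TX1, TX2, Xavier

On ISSUE 2: if the message is a deprecation warning rather than a hard error, torch.solve was deprecated in PyTorch 1.9 in favor of torch.linalg.solve, which would also explain why you still get a valid output:

import torch
A = torch.rand(3, 3)
b = torch.rand(3, 1)
x = torch.linalg.solve(A, b)   # replaces the deprecated torch.solve(b, A)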

@ziimiin
I saw the build script you presented.

You applied the @dusty_nv patch, which is a patch for “too many CUDA resources requested for launch”. It is not the NEON patch.

This is the patch you used.
https://gist.githubusercontent.com/dusty-nv/ce51796085178e1f38e3c6a1663a93a1/raw/fb2e0b6e89960fedd63ffc5a33e49e46dce5c987/pytorch-1.9-jetpack-4.5.1.patch

You will also need the NEON patch.

If you look at the v1.9.0 code for confirmation, you can see that the NEON patch has not been applied.

That is, you need to apply two patches.
https://gist.githubusercontent.com/dusty-nv/ce51796085178e1f38e3c6a1663a93a1/raw/fb2e0b6e89960fedd63ffc5a33e49e46dce5c987/pytorch-1.9-jetpack-4.5.1.patch
and
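
(For anyone wondering what the 'Apply Patch' step amounts to in practice: it is typically just a git apply of the downloaded patch inside the PyTorch source checkout. A sketch using the first patch above; the second patch link did not survive in the post:)

cd pytorch    # your PyTorch source checkout
wget https://gist.githubusercontent.com/dusty-nv/ce51796085178e1f38e3c6a1663a93a1/raw/fb2e0b6e89960fedd63ffc5a33e49e46dce5c987/pytorch-1.9-jetpack-4.5.1.patch
git apply pytorch-1.9-jetpack-4.5.1.patch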

Thanks for your link! I used the wrong link.
Kind regards.

@gyt.971027

Exception raised from setQEngine at /media/nvidia/WD_NVME/PyTorch/JetPack_4.4.1/pytorch-v1.8.0/aten/src/ATen/Context.cpp:184 (most recent call first):

This path seems to be where the exception is raised, as per your message above:

/media/nvidia/WD_NVME/PyTorch/JetPack_4.4.1/pytorch-v1.8.0/aten/src/ATen/Context.cpp:184

This is the new error after I use PyTorch 1.9.0:

['none']
terminate called after throwing an instance of 'c10::Error'
  what():  quantized engine QNNPACK is not supported
Exception raised from setQEngine at /media/nvidia/NVME/pytorch/pytorch-v1.9.0/aten/src/ATen/Context.cpp:181 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xa0 (0x7f4ddab300 in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb4 (0x7f4dda76b4 in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: at::Context::setQEngine(c10::QEngine) + 0x138 (0x7f64694940 in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #3: THPModule_setQEngine(_object*, _object*) + 0x94 (0x7f691f6364 in /home/yuantian/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #5: python3() [0x52ba70]
frame #7: python3() [0x529978]
frame #9: python3() [0x5f4d34]
frame #11: python3() [0x5a7228]
frame #12: python3() [0x582308]
frame #16: python3() [0x529978]
frame #17: python3() [0x52b8f4]
frame #19: python3() [0x52b108]
frame #24: __libc_start_main + 0xe0 (0x7f82136720 in /lib/aarch64-linux-gnu/libc.so.6)
frame #25: python3() [0x420e94]

Aborted (core dumped)

And I searched the computer; I do not have this file:
/media/nvidia/NVME/pytorch/pytorch-v1.9.0/aten/src/ATen/Context.cpp:181