PyTorch for Jetson

1127733673 · May 10, 2022, 1:08am

Hi, when I try to install torchvision on a Jetson Nano, it gets stuck at "writing top-level names to torchvision.egg-info/top_level.txt" and nothing else changes. What can I do to make it install faster?

dusty_nv · May 10, 2022, 7:03pm

Are you on Nano? Perhaps it is low on memory and you want to limit the number of compiling jobs with export MAX_JOBS=1 beforehand (or perhaps set it to 2). You could also disable ZRAM and mount SWAP: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-transfer-learning.md#mounting-swap
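
If the build is launched from a script rather than an interactive shell, the same limit can be passed through the environment. This is only a minimal sketch: it assumes the build is started with python3 setup.py install --user from the torchvision checkout, and the helper names build_env and run_build are made up for illustration.

```python
import os
import subprocess


def build_env(max_jobs=1, base=None):
    """Return a copy of the environment with MAX_JOBS set for the build."""
    env = dict(os.environ if base is None else base)
    # MAX_JOBS caps the number of parallel compile jobs started by the build.
    env["MAX_JOBS"] = str(max_jobs)
    return env


def run_build(source_dir, max_jobs=1):
    """Run the build in source_dir with a capped number of compile jobs."""
    cmd = ["python3", "setup.py", "install", "--user"]
    return subprocess.run(cmd, cwd=source_dir, env=build_env(max_jobs), check=True)
```

Calling run_build("/home/jetson/torchvision", max_jobs=1) should be roughly equivalent to running export MAX_JOBS=1 before the install command; the path is only an example.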

Zelda · May 11, 2022, 5:39am

OK, that helped a little, but it ends with an error.

jetson@jetson:~/torchvision$ python3 setup.py install --user
/home/jetson/.local/lib/python3.6/site-packages/pkg_resources/__init__.py:119: PkgResourcesDeprecationWarning: 0.18ubuntu0.18.04.1 is an invalid version and will not be supported in a future release
PkgResourcesDeprecationWarning,
Building wheel torchvision-0.10.0
PNG found: False
Running build on conda-build: False
Running build on conda: False
JPEG found: True
Building torchvision with JPEG image support
NVJPEG found: False
FFmpeg found: True
ffmpeg include path: [‘/usr/include’, ‘/usr/include/aarch64-linux-gnu’]
ffmpeg library_dir: [‘/usr/lib’, ‘/usr/lib/aarch64-linux-gnu’]
running install
/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
setuptools.SetuptoolsDeprecationWarning,
/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/easy_install.py:159: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
EasyInstallDeprecationWarning,
/home/jetson/.local/lib/python3.6/site-packages/pkg_resources/__init__.py:119: PkgResourcesDeprecationWarning: 0.18ubuntu0.18.04.1 is an invalid version and will not be supported in a future release
PkgResourcesDeprecationWarning,
running bdist_egg
running egg_info
writing torchvision.egg-info/PKG-INFO
writing dependency_links to torchvision.egg-info/dependency_links.txt
writing requirements to torchvision.egg-info/requires.txt
writing top-level names to torchvision.egg-info/top_level.txt
reading manifest file ‘torchvision.egg-info/SOURCES.txt’
reading manifest template ‘MANIFEST.in’
warning: no previously-included files matching ‘__pycache__’ found under directory ‘*’
warning: no previously-included files matching ‘*.py[co]’ found under directory ‘*’
adding license file ‘LICENSE’
writing manifest file ‘torchvision.egg-info/SOURCES.txt’
installing library code to build/bdist.linux-aarch64/egg
running install_lib
running build_py
copying torchvision/version.py → build/lib.linux-aarch64-3.6/torchvision
running build_ext
building ‘torchvision._C’ extension
Emitting ninja build file /home/jetson/torchvision/build/temp.linux-aarch64-3.6/build.ninja…
Compiling objects…
Using envvar MAX_JOBS (1) as the number of workers…
[1/28] c++ -MMD -MF /home/jetson/torchvision/build/temp.linux-aarch64-3.6/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.o.d -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DWITH_CUDA -I/home/jetson/torchvision/torchvision/csrc -I/home/jetson/.local/lib/python3.6/site-packages/torch/include -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/TH -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c -c /home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp -o /home/jetson/torchvision/build/temp.linux-aarch64-3.6/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.o -DTORCH_API_INCLUDE_EXTENSION_H ‘-DPYBIND11_COMPILER_TYPE=“_gcc”’ ‘-DPYBIND11_STDLIB=“_libstdcpp”’ ‘-DPYBIND11_BUILD_ABI=“_cxxabi1011”’ -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
FAILED: /home/jetson/torchvision/build/temp.linux-aarch64-3.6/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.o
c++ -MMD -MF /home/jetson/torchvision/build/temp.linux-aarch64-3.6/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.o.d -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DWITH_CUDA -I/home/jetson/torchvision/torchvision/csrc -I/home/jetson/.local/lib/python3.6/site-packages/torch/include -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/TH -I/home/jetson/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c -c /home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp -o /home/jetson/torchvision/build/temp.linux-aarch64-3.6/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.o -DTORCH_API_INCLUDE_EXTENSION_H ‘-DPYBIND11_COMPILER_TYPE=“_gcc”’ ‘-DPYBIND11_STDLIB=“_libstdcpp”’ ‘-DPYBIND11_BUILD_ABI=“_cxxabi1011”’ -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp: In static member function ‘static torch::autograd::variable_list vision::ops::{anonymous}::DeformConv2dFunction::forward(torch::autograd::AutogradContext*, const Variable&, const Variable&, const Variable&, const Variable&, const Variable&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, bool)’:
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp:30:9: error: ‘AutoDispatchBelowADInplaceOrView’ is not a member of ‘at’
at::AutoDispatchBelowADInplaceOrView g;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp: In static member function ‘static torch::autograd::variable_list vision::ops::{anonymous}::DeformConv2dBackwardFunction::forward(torch::autograd::AutogradContext*, const Variable&, const Variable&, const Variable&, const Variable&, const Variable&, const Variable&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, bool)’:
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp:145:9: error: ‘AutoDispatchBelowADInplaceOrView’ is not a member of ‘at’
at::AutoDispatchBelowADInplaceOrView g;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:4:0,
from /home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp:3:
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h: In instantiation of ‘torch::autograd::variable_list torch::autograd::CppNode<T>::apply(torch::autograd::variable_list&&) [with T = vision::ops::{anonymous}::DeformConv2dBackwardFunction; torch::autograd::variable_list = std::vector<at::Tensor>]’:
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp:266:1: required from here
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:279:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (num_outputs > num_forward_inputs) {
^~~~~~~~
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:290:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (num_outputs != num_forward_inputs) {
^~~~~~~~~
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:300:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < num_outputs; ++i) {
^~~~~~~~~~~
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h: In instantiation of ‘torch::autograd::variable_list torch::autograd::CppNode<T>::apply(torch::autograd::variable_list&&) [with T = vision::ops::{anonymous}::DeformConv2dFunction; torch::autograd::variable_list = std::vector<at::Tensor>]’:
/home/jetson/torchvision/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp:266:1: required from here
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:279:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (num_outputs > num_forward_inputs) {
^~~~~~~~
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:290:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (num_outputs != num_forward_inputs) {
^~~~~~~~~
/home/jetson/.local/lib/python3.6/site-packages/torch/include/torch/csrc/autograd/custom_function.h:300:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < num_outputs; ++i) {
^~~~~~~~~~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File “/home/jetson/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py”, line 1673, in _run_ninja_build
env=env)
File “/usr/lib/python3.6/subprocess.py”, line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command ‘[‘ninja’, ‘-v’, ‘-j’, ‘1’]’ returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File “setup.py”, line 488, in <module>
‘clean’: clean,
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/__init__.py”, line 153, in setup
return distutils.core.setup(**attrs)
File “/usr/lib/python3.6/distutils/core.py”, line 148, in setup
dist.run_commands()
File “/usr/lib/python3.6/distutils/dist.py”, line 955, in run_commands
self.run_command(cmd)
File “/usr/lib/python3.6/distutils/dist.py”, line 974, in run_command
cmd_obj.run()
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/install.py”, line 74, in run
self.do_egg_install()
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/install.py”, line 116, in do_egg_install
self.run_command(‘bdist_egg’)
File “/usr/lib/python3.6/distutils/cmd.py”, line 313, in run_command
self.distribution.run_command(command)
File “/usr/lib/python3.6/distutils/dist.py”, line 974, in run_command
cmd_obj.run()
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/bdist_egg.py”, line 164, in run
cmd = self.call_command(‘install_lib’, warn_dir=0)
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/bdist_egg.py”, line 150, in call_command
self.run_command(cmdname)
File “/usr/lib/python3.6/distutils/cmd.py”, line 313, in run_command
self.distribution.run_command(command)
File “/usr/lib/python3.6/distutils/dist.py”, line 974, in run_command
cmd_obj.run()
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/install_lib.py”, line 11, in run
self.build()
File “/usr/lib/python3.6/distutils/command/install_lib.py”, line 109, in build
self.run_command(‘build_ext’)
File “/usr/lib/python3.6/distutils/cmd.py”, line 313, in run_command
self.distribution.run_command(command)
File “/usr/lib/python3.6/distutils/dist.py”, line 974, in run_command
cmd_obj.run()
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/build_ext.py”, line 79, in run
_build_ext.run(self)
File “/home/jetson/.local/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py”, line 186, in run
_build_ext.build_ext.run(self)
File “/usr/lib/python3.6/distutils/command/build_ext.py”, line 339, in run
self.build_extensions()
File “/home/jetson/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py”, line 708, in build_extensions
build_ext.build_extensions(self)
File “/home/jetson/.local/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py”, line 195, in build_extensions
_build_ext.build_ext.build_extensions(self)
File “/usr/lib/python3.6/distutils/command/build_ext.py”, line 448, in build_extensions
self._build_extensions_serial()
File “/usr/lib/python3.6/distutils/command/build_ext.py”, line 473, in _build_extensions_serial
self.build_extension(ext)
File “/home/jetson/.local/lib/python3.6/site-packages/setuptools/command/build_ext.py”, line 202, in build_extension
_build_ext.build_extension(self, ext)
File “/usr/lib/python3.6/distutils/command/build_ext.py”, line 533, in build_extension
depends=ext.depends)
File “/home/jetson/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py”, line 538, in unix_wrap_ninja_compile
with_cuda=with_cuda)
File “/home/jetson/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py”, line 1359, in _write_ninja_file_and_compile_objects
error_prefix=‘Error compiling objects for extension’)
File “/home/jetson/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py”, line 1683, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

tanelmina · May 11, 2022, 8:49am

Hi! I have exactly the same issue as user160662 on the 8th of May with JetPack 5.0-b114:

For some reason both wheels torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl and torch-1.11.0-cp38-cp38-linux_aarch64.whl say after installation that:

Successfully installed torch-1.8.0

…and then, if I run PyTorch I get this message:

OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory

Deleting and reinstalling the wheels does not change anything.

After installing JetPack (Jetson AGX Xavier), when I try to run the container nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3, I get:

failed to register layer: Error processing tar file(exit status 1): write /usr/lib/aarch64-linux-gnu/libcudnn_ops_infer_static.a: no space left on device

Could you please give me a clue about what I’m doing wrong here, or what I should avoid doing?

Thanks!

dusty_nv · May 11, 2022, 1:41pm

It would appear that the version of torchvision you are compiling isn’t compatible with the version of PyTorch you have installed. What’s your version of PyTorch and torchvision? Have you tried the container yet?

dusty_nv · May 11, 2022, 1:44pm

Have you done this to uninstall torch first:

pip3 uninstall torch
sudo pip3 uninstall torch

Then try re-downloading and re-installing the PyTorch 1.11 wheel. Unfortunately I’m not sure where this MPI error comes from.

Can you check your free disk space with df -h?
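
The same check can be made from Python when it needs to be scripted. This is only a sketch using the standard library; the function name free_gib is made up for illustration, and "/" is assumed to be the filesystem that Docker and pip write to.

```python
import shutil


def free_gib(path="/"):
    """Return the free space, in GiB, on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


if __name__ == "__main__":
    print(f"{free_gib('/'):.1f} GiB free on /")
```

A container image needs room for both the downloaded layers and their extracted contents, so the free space should be well above the compressed image size.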

Zelda · May 11, 2022, 4:22pm

I did not try containers because, quite frankly, I don’t know how to run Python scripts in them or how to install additional libraries.

I have installed PyTorch v1.8.0 and I’m trying to install torchvision version 0.9.0 according to the instructions at the beginning of this thread.
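
Note that the build log earlier in the thread prints "Building wheel torchvision-0.10.0", which suggests the checkout being compiled may not be the 0.9.0 release. A quick way to sanity-check a pairing is sketched below. The table lists the pairings from the torchvision README as far as I know them (major.minor only); the function names are made up for illustration.

```python
# PyTorch -> torchvision pairings (major.minor), per the torchvision README.
COMPATIBLE = {
    "1.8": "0.9",
    "1.9": "0.10",
    "1.10": "0.11",
    "1.11": "0.12",
    "1.12": "0.13",
}


def major_minor(version):
    """Reduce a version such as '1.12.0a0+2c916ef.nv22.3' to '1.12'."""
    parts = version.split("+")[0].split(".")
    return ".".join(parts[:2])


def expected_torchvision(torch_version):
    """Return the matching torchvision major.minor, or None if not in the table."""
    return COMPATIBLE.get(major_minor(torch_version))


def is_compatible(torch_version, torchvision_version):
    """True only when the pairing appears in the table above."""
    expected = expected_torchvision(torch_version)
    return expected is not None and major_minor(torchvision_version) == expected
```

With these pairings, is_compatible("1.8.0", "0.10.0") is False, so a 0.10.0 source tree would not be expected to build against PyTorch 1.8.0.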

user22290 · May 11, 2022, 6:23pm

I have the same issue on Jetson Orin. I installed the exact version you sent me.
When I run yolov5 I get this error:

ERROR: Could not find a version that satisfies the requirement onnxruntime-gpu (from versions: none)
ERROR: No matching distribution found for onnxruntime-gpu
requirements: Command ‘pip install ‘onnxruntime-gpu’’ returned non-zero exit status 1.
/home/x1demo/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:55: UserWarning: Specified provider ‘CUDAExecutionProvider’ is not in available provider names.Available providers: ‘CPUExecutionProvider’
warnings.warn(“Specified provider ‘{}’ is not in available provider names.”
Forcing --batch-size 1 square inference shape(1,3,640,640) for non-PyTorch backends
val: Scanning ‘/media/x1demo/T7/coco/val2017.cache’ images and labels… 4952 fo
Class Images Labels P R mAP@.5 mAP@
Traceback (most recent call last):
File “val.py”, line 378, in <module>
main(opt)
File “val.py”, line 351, in main
run(**vars(opt))
File “/home/x1demo/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py”, line 27, in decorate_context
return func(*args, **kwargs)
File “val.py”, line 217, in run
out = non_max_suppression(out, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls)
File “/media/x1demo/T7/yolov5-master/utils/general.py”, line 752, in non_max_suppression
i = torchvision.ops.nms(boxes, scores, iou_thres) # NMS
File “/home/x1demo/.local/lib/python3.8/site-packages/torchvision/ops/boxes.py”, line 39, in nms
_assert_has_ops()
File “/home/x1demo/.local/lib/python3.8/site-packages/torchvision/extension.py”, line 33, in _assert_has_ops
raise RuntimeError(
RuntimeError: Couldn’t load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.

Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnxruntime
>>> print(onnxruntime.__version__)
1.11.1
>>> import tensorrt
>>> print(tensorrt.__version__)
8.4.0.9
>>> import torch
>>> torch.__version__
'1.11.0'
>>> import torchvision
/home/x1demo/.local/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
>>> torchvision.__version__
'0.12.0'

tanelmina · May 12, 2022, 8:35am

Thanks for replying!

First of all, yes: PyTorch was uninstalled before re-installation, and the wheel was also deleted and re-downloaded. The problem remained.

Before pulling the container, df -h showed:

Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1   28G   17G  9.5G  64% /
none            7.3G     0  7.3G   0% /dev
tmpfs           7.3G   36K  7.3G   1% /dev/shm
tmpfs           1.5G   19M  1.5G   2% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           7.3G     0  7.3G   0% /sys/fs/cgroup
tmpfs           1.5G   16K  1.5G   1% /run/user/124
tmpfs           1.5G   28K  1.5G   1% /run/user/1000

During the download and extraction, only /dev/mmcblk0p1 was used.
What can I fix here?
I’m very grateful for all the advice!

dusty_nv · May 12, 2022, 5:07pm

Hmm okay, well it appears that you have enough space. Were you able to pull the container okay, and only get the error when you try to run it? i.e. is sudo docker pull nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3 successful for you?

dusty_nv · May 12, 2022, 5:20pm

Can you check that you have the torchvision C extension on your system? This is where mine is:

/usr/local/lib/python3.6/dist-packages/torchvision-0.10.0a0+300a8a4-py3.6-linux-aarch64.egg/torchvision/_C.so

You should be able to find the base directory of yours with pip3 show torchvision
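
If it helps, the lookup can also be done from Python without importing torchvision, which is useful when the import itself fails. This is only a sketch: the function name find_extension_files and the _C*.so pattern are my assumptions about how the compiled extension is named on disk.

```python
import glob
import importlib.util
import os


def find_extension_files(package="torchvision", pattern="_C*.so"):
    """List files matching pattern in an installed package, without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a regular package directory
    found = []
    for directory in spec.submodule_search_locations:
        found.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(found)


if __name__ == "__main__":
    print(find_extension_files() or "no _C extension found")
```

An empty result would point to a torchvision install whose C++ extension was never built or copied into place.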

user22290 · May 12, 2022, 7:53pm

Yes, I have _C.so in ~/.local/lib/python3.8/site-packages/torchvision

tanelmina · May 13, 2022, 6:43am

Unfortunately not; the problem already occurs while pulling. The results are below:

jetcat@ubuntu:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1   28G   17G  9.2G  65% /
none            7.3G     0  7.3G   0% /dev
tmpfs           7.3G   36K  7.3G   1% /dev/shm
tmpfs           1.5G   19M  1.5G   2% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           7.3G     0  7.3G   0% /sys/fs/cgroup
tmpfs           1.5G   16K  1.5G   1% /run/user/124
jetcat@ubuntu:~$ sudo docker pull nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3
[sudo] password for jetcat:
r34.1.0-pth1.12-py3: Pulling from nvidia/l4t-pytorch
5a7855fb0d7a: Pull complete
759f91a93adc: Pull complete
4f223a58ba55: Pull complete
fd8c1688c597: Pull complete
3304a02fae9a: Pull complete
85d0f53def7b: Pull complete
84f2cc1e8031: Pull complete
926a53aec541: Pull complete
8eb4d88b906b: Pull complete
c47ce97e79d0: Pull complete
ac7fb52e2a53: Pull complete
ec3c8505a4e1: Pull complete
4ed77e2dd171: Pull complete
a304dfd3102d: Pull complete
94aacccf5a11: Pull complete
9b13e24364dc: Extracting [==================================================>] 1.798GB/1.798GB
36141b96492b: Download complete
9a821f5f8b98: Download complete
c7fb1921beda: Download complete
745ed95dc2a9: Download complete
14e5a1a4264c: Download complete
c813606ebca1: Download complete
ef6f8f2a2e75: Download complete
40bc15802e4b: Download complete
ba31e11f96e7: Download complete
238720b450a3: Download complete
47b03fb78718: Download complete
0b854a0e74f6: Download complete
7e72acf6aa32: Download complete
6ace47ae640e: Download complete
a17e2fcfb740: Download complete
d051fe0a8833: Download complete
a791c0161136: Download complete
2050e56b9b1d: Download complete
f323b3050cdd: Download complete
027faa2c2aa1: Download complete
c0839d8d16be: Download complete
failed to register layer: Error processing tar file(exit status 1): write /usr/lib/aarch64-linux-gnu/libcudnn_cnn_train_static.a: no space left on device
jetcat@ubuntu:~$

dusty_nv · May 13, 2022, 1:28pm

Hmm okay…my thought is that there isn’t enough space on your eMMC to have docker download + extract the container image. Are you able to mount other storage (i.e. on NVME or USB drive or similar)? You can then change the docker data root to it so that it’s stored on your mounted storage.
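
For the data root change, the relevant setting is the "data-root" key in Docker's daemon.json (normally /etc/docker/daemon.json). On Jetson that file usually already holds the NVIDIA runtime entry, so the key should be merged in rather than the file overwritten. A rough sketch of the merge follows; the mount point /mnt/nvme/docker and the function name with_data_root are made up for illustration.

```python
import json


def with_data_root(daemon_json_text, data_root):
    """Return daemon.json content with "data-root" set, keeping all existing keys."""
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["data-root"] = data_root
    return json.dumps(config, indent=4) + "\n"


if __name__ == "__main__":
    # Example input resembling a Jetson daemon.json; the mount point is hypothetical.
    existing = '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}}}'
    print(with_data_root(existing, "/mnt/nvme/docker"))
```

The result would then be written back to daemon.json as root and the Docker service restarted. Images that were already pulled stay in the old location unless they are copied over.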

I’m still not sure why you get the MPI errors when I don’t, so if you keep having issues with this you might want to re-flash your device with a fresh copy of JetPack 5.0 before sinking a lot more time into it.

user22290 · May 13, 2022, 4:51pm

any update?

dusty_nv · May 16, 2022, 2:08pm

Unfortunately I’m unable to reproduce the issue from here - do you have a simple standalone script that triggers it that I could try?

My guess is that since you do have the torchvision _C.so library, it fails to load due to some unmet/unmatched dependencies and then throws that error.
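
One way to see which dependency is missing is to load the library directly and read the loader's message. This is only a sketch: try_load is a made-up helper and the path in the example is a placeholder for wherever _C.so lives on your system.

```python
import ctypes


def try_load(path):
    """Try to load a shared library; return None on success, else the loader's message."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Placeholder path; substitute the _C.so location reported on your system.
    print(try_load("/path/to/site-packages/torchvision/_C.so"))
```

Since _C.so links against the PyTorch libraries, it is probably best to run import torch in the same Python session first; otherwise the message may only report that those libraries could not be found.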

user22290 · May 16, 2022, 9:23pm

The steps I followed are:
git clone https://github.com/ultralytics/yolov5
cd yolov5/
pip install -r requirements.txt

Download YOLOv5s.pt from Ultralytics:
https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt

python3 ./export.py --weights yolov5s.pt --include onnx engine --device 0

python3 val.py --weights yolov5s.onnx --data coco128.yaml --img 640

Then I get the error I reported earlier.

bruce823ad · May 16, 2022, 10:05pm

In the spirit of this discussion, I am working on converting a yolov5 model to TensorRT using the Ultralytics export process mentioned by user22290. The goal is to run outside the container with no dependence on PyTorch (https://github.com/alxmamaev/jetson_yolov5_tensorrt). I am doing the export within the l4t-ml R34.1.0 container. I have an AGX with 5.0 EA. Unfortunately, when I run the engine outside the container, I get an engine mismatch: expecting library version 8.4.0.9, got 8.4.0.8. I will try using the ONNX output to build the TensorRT engine (I’m still new to this), but I was hoping the new container and JetPack would have the same version of TensorRT.

Raymond_456 · May 17, 2022, 7:08am

Hello, I followed the PyTorch tutorial to build PyTorch 1.9 on AGX Orin (JetPack 5.0) and got the following error.

[ 81%] Building NVCC (Device) object caffe2/CMakeFiles/torch_cuda.dir/operators/torch_cuda_generated_depthwise_3x3_conv_op_cudnn.cu.o
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(149): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(200): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(236): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(908): error: namespace “thrust” has no member “host_vector”
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(908): error: expected an expression
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(909): error: namespace “thrust” has no member “host_vector”
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(909): error: expected an expression
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(910): error: namespace “thrust” has no member “host_vector”
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(910): error: type name is not allowed
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(910): error: expected an expression
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(912): error: identifier “A_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(913): error: identifier “B_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(914): error: identifier “C_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(917): error: identifier “A_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(919): error: identifier “B_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(920): error: identifier “C_array” is undefined
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(1763): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(2234): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(2282): warning: the “visibility” attribute can only appear on functions and variables with external linkage
/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu(2846): warning: the “visibility” attribute can only appear on functions and variables with external linkage
13 errors detected in the compilation of “/home/py/Downloads/pytorch/caffe2/utils/math_gpu.cu”.
CMake Error at torch_cuda_generated_math_gpu.cu.o.Release.cmake:281 (message):
Error generating file /home/py/Downloads/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/utils/./torch_cuda_generated_math_gpu.cu.o
make[2]: *** [caffe2/CMakeFiles/torch_cuda.dir/build.make:1388: caffe2/CMakeFiles/torch_cuda.dir/utils/torch_cuda_generated_math_gpu.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs…
make[1]: *** [CMakeFiles/Makefile2:6966: caffe2/CMakeFiles/torch_cuda.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

dusty_nv · May 17, 2022, 2:15pm

Hi @bruce823ad, I would advise exporting to ONNX from PyTorch and then using the ONNX to build the TensorRT engine, as this is more future-proof and portable (not only across JetPack versions, but across different types of GPUs and Jetson devices)
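
As a rough sketch of the second half of that workflow, the ONNX file can be turned into an engine with trtexec on the device that will run it, so the engine is built against the TensorRT version installed there. The trtexec location below is the usual one on JetPack but should be treated as an assumption, and the file names and helper functions are only examples.

```python
import shlex
import subprocess

# Usual trtexec location on JetPack; adjust if your install differs.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def trtexec_command(onnx_path, engine_path, fp16=False):
    """Build the trtexec argument list that converts an ONNX model to an engine."""
    cmd = [TRTEXEC, f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")  # allow FP16 kernels when building the engine
    return cmd


def build_engine(onnx_path, engine_path, fp16=False):
    """Run trtexec; raises CalledProcessError if the conversion fails."""
    subprocess.run(trtexec_command(onnx_path, engine_path, fp16), check=True)


if __name__ == "__main__":
    print(shlex.join(trtexec_command("yolov5s.onnx", "yolov5s.engine", fp16=True)))
```

The ONNX file itself can be produced inside the container (for example with the yolov5 export.py --include onnx step mentioned earlier) and copied out, since ONNX is not tied to a specific TensorRT build.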