Hello AI World for Jetpack 6.0 DP - Pytorch 2.1.0 Installed, Torchvision Did Not Install

I have an Nvidia Orin Nano Dev Kit with JetPack 6.0 DP. I installed @dusty_nv’s Hello AI World project by building it from source, with no problems. I skipped the PyTorch installation step. The imagenet examples worked, so the project build was good.

Problem: In the Transfer Learning with Pytorch section, I attempted to install Pytorch via ./install-pytorch.sh. Pytorch installed, Torchvision did not. An excerpt of the output is shown below. More here: log.txt (41.1 KB)


File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 525, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 413, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.2). Please make sure to use the same CUDA versions.

I started python3 to verify the installation:

:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import torch
>>> torch.__version__
'2.1.0'
>>> torch.cuda.is_available()
True
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torchvision'

I followed the Hello AI World steps, but there’s a compatibility issue here. How do I install a torchvision that is compatible with JetPack 6.0 DP and the latest Hello AI World project?

@xplanescientist it didn’t build because somehow you have CUDA 11.5 on your system, even though JetPack 6.0 DP ships with CUDA 12.2, which is what the PyTorch wheel it installed was built against.

What does ls -ll /usr/local/cuda show? How did you get JetPack 6 on this system?

I did explicitly update and test install-pytorch.sh for JetPack 6, and built the r36.2.1 container for jetson-inference as well (you could try that in the meantime).
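(For reference, the two numbers in that error message come from torch.utils.cpp_extension, which runs the nvcc it finds via CUDA_HOME, or, when that is unset, whichever nvcc is first on the PATH, and compares it against torch.version.cuda. A quick diagnostic sketch, assuming the stock JetPack paths:)

python3 -c "import torch; print(torch.version.cuda)"    # CUDA the wheel was built with (expect 12.2)
which nvcc && nvcc --version                             # the nvcc a build will pick up from the PATH
/usr/local/cuda/bin/nvcc --version                       # the nvcc from the JetPack CUDA install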

@dusty_nv, I installed JetPack 6.0 DP yesterday via the SD card image method from the nvidia jetpack page. A day prior, I flashed the QSPI bootloader on my new Nvidia Orin Nano Dev Kit.

Running ls -ll /usr/local shows:

drwxr-xr-x 11 root root 4096 Nov 30 16:33 ./
drwxr-xr-x 11 root root 4096 Feb 17 2023 ../
drwxr-xr-x 2 root root 4096 Jan 5 14:52 bin/
lrwxrwxrwx 1 root root 22 Nov 30 16:33 cuda → /etc/alternatives/cuda/
lrwxrwxrwx 1 root root 25 Nov 30 16:33 cuda-12 → /etc/alternatives/cuda-12/
drwxr-xr-x 12 root root 4096 Nov 30 16:33 cuda-12.2/
drwxr-xr-x 2 root root 4096 Feb 17 2023 etc/
drwxr-xr-x 2 root root 4096 Feb 17 2023 games/
drwxr-xr-x 4 root root 4096 Jan 5 14:31 include/
drwxr-xr-x 4 root root 4096 Jan 5 14:31 lib/
lrwxrwxrwx 1 root root 9 Feb 17 2023 man → share/man/
drwxr-xr-x 2 root root 4096 Feb 17 2023 sbin/
drwxr-xr-x 9 root root 4096 Jan 5 14:31 share/
drwxr-xr-x 2 root root 4096 Feb 17 2023 src/

As I said, I followed your Hello AI World instructions this morning and built the project from source, thinking that approach would handle compatibility issues.

I’ll try the container you cited.

@dusty_nv, I started from scratch with a new JetPack 6.0 DP image. I have plenty of blank SD cards.

I went through the Hello AI World build-from-source again because I’m familiar with this approach. After the cmake ../ step in Configuring with CMake, the output in the terminal shows the following. More here: log_cmake.txt (51.9 KB)

File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 525, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 413, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.2). Please make sure to use the same CUDA versions.
[jetson-inference] installation complete, exiting with status code 0
[jetson-inference] to run this tool again, use the following commands:
$ cd /build
$ ./install-pytorch.sh
[Pre-build] Finished CMakePreBuild script
-- Finished installing dependencies
-- using patched FindCUDA.cmake
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDA: /usr (found version "11.5")
-- CUDA version: 11.5
-- CUDA 11.5 detected (aarch64), enabling SM_53 SM_62
-- CUDA 11.5 detected (aarch64), enabling SM_72
-- CUDA 11.5 detected (aarch64), enabling SM_87
-- Found OpenCV: /usr (found version "4.8.0") found components: core calib3d
-- OpenCV version: 4.8.0
-- OpenCV version >= 3.0.0, enabling OpenCV
CMake Warning at CMakeLists.txt:106 (find_package):
Could not find a configuration file for package “VPI” that is compatible
with requested version “2.0”.
The following configuration files were considered but not accepted:
/usr/lib/cmake/vpi3/vpi-config.cmake, version: 3.0.10
/lib/cmake/vpi3/vpi-config.cmake, version: 3.0.10

But the imaged 6.0 SD card only has CUDA 12. The cmake process is confused about CUDA versions. ll /usr/local shows this:

jet@sky:~$ ll /usr/local
drwxr-xr-x 11 root root 4096 Nov 30 16:33 ./
drwxr-xr-x 11 root root 4096 Feb 17 2023 ../
drwxr-xr-x 2 root root 4096 Jan 6 01:50 bin/
lrwxrwxrwx 1 root root 22 Nov 30 16:33 cuda → /etc/alternatives/cuda/
lrwxrwxrwx 1 root root 25 Nov 30 16:33 cuda-12 → /etc/alternatives/cuda-12/
drwxr-xr-x 12 root root 4096 Nov 30 16:33 cuda-12.2/
drwxr-xr-x 2 root root 4096 Feb 17 2023 etc/
drwxr-xr-x 2 root root 4096 Feb 17 2023 games/
drwxr-xr-x 4 root root 4096 Jan 6 01:50 include/
drwxr-xr-x 4 root root 4096 Jan 6 01:50 lib/
lrwxrwxrwx 1 root root 9 Feb 17 2023 man → share/man/
drwxr-xr-x 2 root root 4096 Feb 17 2023 sbin/
drwxr-xr-x 9 root root 4096 Jan 6 01:50 share/
drwxr-xr-x 2 root root 4096 Feb 17 2023 src/
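(Since the alternatives symlink above looks correct, one quick check is whether a second CUDA toolkit is shadowing it somewhere else on the system; a diagnostic sketch:)

readlink -f /usr/local/cuda          # should resolve to /usr/local/cuda-12.2
/usr/local/cuda/bin/nvcc --version   # the toolkit JetPack installed
which -a nvcc                        # any other nvcc visible on the PATH?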

I went through the remaining steps in Compiling the Project to complete the AI World install. Then I circled back to the pytorch installation tool:

cd jetson-inference/build
./install-pytorch.sh

Only one package was listed for installation, 'PyTorch 2.1 for Python 3.10'. I selected it again, pressed Enter to continue, and it output this:

File “/usr/lib/python3.10/distutils/command/build_ext.py”, line 340, in run
self.build_extensions()
File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 525, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File “/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py”, line 413, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.2). Please make sure to use the same CUDA versions.
[jetson-inference] installation complete, exiting with status code 0
[jetson-inference] to run this tool again, use the following commands:

It still thinks CUDA 11.5 is installed, even though ll /usr/local still shows CUDA 12.2:

drwxr-xr-x 11 root root 4096 Nov 30 16:33 ./
drwxr-xr-x 11 root root 4096 Feb 17 2023 ../
drwxr-xr-x 2 root root 4096 Jan 6 01:50 bin/
lrwxrwxrwx 1 root root 22 Nov 30 16:33 cuda → /etc/alternatives/cuda/
lrwxrwxrwx 1 root root 25 Nov 30 16:33 cuda-12 → /etc/alternatives/cuda-12/
drwxr-xr-x 12 root root 4096 Nov 30 16:33 cuda-12.2/
drwxr-xr-x 2 root root 4096 Feb 17 2023 etc/

I checked in a python3 session, and of course torchvision is not installed:

jet@sky:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import torch
>>> torch.__version__
'2.1.0'
>>> torch.cuda.is_available()
True
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torchvision'

Is the cmake step the problem? Or the 6.0 DP image?

Hmm, very strange - the actual procedure that install-pytorch.sh performs isn’t very complicated and mirrors the one from PyTorch for Jetson. Are you able to install PyTorch/torchvision yourself that way, or do you encounter the same CUDA 11.5 error? I don’t know the source of that error at this point, since your ll /usr/local does indeed show that only CUDA 12.2 is installed (as it should be). Did the container end up working for you? That could help rule out underlying platform/driver issues.

I tried installing pytorch/torchvision with the pip wheel approach you suggested, but no luck. See the output:

jet@sky:~$ pip3 install numpy Downloads/torch-2.1.0-cp310-cp310-linux_aarch64.whl
Defaulting to user installation because normal site-packages is not writeable
Processing ./Downloads/torch-2.1.0-cp310-cp310-linux_aarch64.whl
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (1.21.5)
Requirement already satisfied: sympy in /usr/lib/python3/dist-packages (from torch==2.1.0) (1.9)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0) (3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0) (3.1.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0) (2023.12.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0) (4.9.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0) (3.13.1)
Requirement already satisfied: MarkupSafe>=2.0 in ./.local/lib/python3.10/site-packages (from jinja2->torch==2.1.0) (2.1.3)
Installing collected packages: torch
Successfully installed torch-2.1.0
jet@sky:~$ cd
jet@sky:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import torch
>>> torch.cuda.is_available()
True

>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torchvision'

Anything else I can try before looking into the container approach? I don’t wholly understand docker/containers; I need to read up on it further. It seems like Anaconda, but for Linux.

It looks like you did the PyTorch installation, but not the torchvision installation. See the torchvision section underneath Installation in that PyTorch topic. For PyTorch 2.1, you’ll want to clone torchvision v0.16.1.

I missed the torchvision section in Installation. I was under the impression that installing pytorch automatically installs torchvision, just as the install-pytorch.sh tool does.

Okay, I went through this process for torchvision v0.16.1 (compatible with pytorch 2.1):

sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libopenblas-dev libavcodec-dev libavformat-dev libswscale-dev
git clone --branch v0.16.1 https://github.com/pytorch/vision torchvision
cd torchvision
export BUILD_VERSION=0.16.1
python3 setup.py install --user

But I ran into the same problem - the installer thinks CUDA 11.5 is installed when in fact CUDA 12.2 is.

File “/home/jet/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py”, line 413, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.2). Please make sure to use the same CUDA versions.

Oddly enough, when I check the installation in python, torchvision seems to be there but incomplete. I say incomplete because I’m unable to check torchvision.__version__, and __version__ is a standard attribute; other attributes may be missing too. This is not a reliable installation.

jet@sky:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import torch
>>> torch.__version__
'2.1.0'
>>> torch.cuda.is_available()
True

>>> import torchvision
>>> torchvision.__version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'torchvision' has no attribute '__version__'

>>> torchvision.__   # tab completion offers only dunder attributes (__annotations__, __dict__, __doc__, __file__, __loader__, __name__, __package__, __path__, __spec__, ...) and no public API
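(A quick way to tell whether python is importing a half-built copy from the source checkout or a leftover of the failed install, rather than a complete package; a diagnostic sketch:)

python3 -c "import torchvision; print(torchvision.__file__, torchvision.__path__)"   # where the import resolves to
pip3 show torchvision                                                                # whether a package is actually registered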

Is there a JetPack image available that includes all the JetPack 6.0 DP parts plus pytorch and torchvision, so that people like me can go through the whole Hello AI World package, including the Transfer Learning section?

There isn’t; for that, please use the jetson-inference container, which already includes that stuff pre-installed. I’m at a loss as to why it keeps detecting CUDA 11.5 on your system, because I’ve done the exact same process on multiple JetPack 6.0 machines, and you are the first to report it…
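One thing that might be worth trying before falling back to the container (a sketch, not verified on this particular setup): torch.utils.cpp_extension checks the CUDA_HOME/CUDA_PATH environment variables before falling back to whichever nvcc it finds on the PATH, so pinning them to the JetPack toolkit may steer the torchvision build onto CUDA 12.2:

export CUDA_HOME=/usr/local/cuda-12.2
export PATH=/usr/local/cuda-12.2/bin:$PATH
cd ~/torchvision                 # the v0.16.1 checkout from the earlier steps
export BUILD_VERSION=0.16.1
python3 setup.py install --user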

@dusty_nv, does the order in which elements are installed matter? I started from scratch several times. This is the exact order:

  1. Install Jetpack 6.0 DP on SD card via the SD Card Image Method from Win10 PC
  2. sudo apt update and sudo apt upgrade
  3. Install the Hello AI World package by building the project from source
  4. Install pytorch/torchvision with ./install-pytorch.sh per the directions therein

I own several old Jetson Nanos, and that’s the approach that worked every time. I don’t understand why the Orin is more difficult.

That’s the same order that I use, minus the ‘apt upgrade’ - maybe that is introducing some issue with CUDA? I haven’t seen it reported otherwise, though. You could also try a slightly older version of torchvision (like 0.15 instead) to see if the issue is specific to that version, although I’ve started from the same steps (except the upgrade part) and not encountered this. Sorry about that.

Okay, I started from scratch again, but without the sudo apt upgrade. The CUDA version mismatch error persists and shows up in cmake ../ again: cmake detects CUDA 11.5, while /usr/local shows CUDA 12.2 installed, as expected with JetPack 6.0 DP.

Digging further in the jetson-inference directory, a recursive grep shows CUDA 11.5 showing up in many generated build files:

jet@sky:~/jetson-inference$ grep -r "11.5" . | grep CUDA

./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_depthNet.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_tensorConvert.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_tensorConvert.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_detectNet.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_segNet.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_depthNet.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_detectNet.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_segNet.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_backgroundNet.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/CMakeFiles/jetson-inference.dir/c/jetson-inference_generated_backgroundNet.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-intrinsic.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaNormalize.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaGrayscale.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-intrinsic.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaColormap.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-YUYV.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaFilterMode.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaRGB.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaOverlay.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-affine.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-NV12.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaDraw.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaCrop.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaDraw.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaGrayscale.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaFont.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaRGB.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-YUYV.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
grep: ./build/torch-2.1.0-cp310-cp310-linux_aarch64.whl: binary file matches
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-fisheye.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaResize.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaOverlay.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-YV12.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaResize.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-NV12.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaFilterMode.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-affine.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-YV12.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaColormap.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaPointCloud.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaCrop.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaNormalize.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaPointCloud.cu.o.cmake:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaWarp-fisheye.cu.o.cmake.pre-gen:set(CUDA_VERSION 11.5)
./build/utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaFont.cu.o.cmake:set(CUDA_VERSION 11.5)
grep: ./data/images/drone_0435.png: binary file matches
./build/CMakeCache.txt:CUDA_VERSION:STRING=11.5
./build/CMakeCache.txt:FIND_PACKAGE_MESSAGE_DETAILS_CUDA:INTERNAL=[/usr][/usr/bin/nvcc][/usr/include][/usr/lib/aarch64-linux-gnu/libcudart_static.a][v11.5()]


Do the cmake files need to be updated? It looks like 11.5 is ending up hardcoded in them improperly.
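(Those CMakeCache entries are generated at configure time rather than hardcoded in the repo, and the last line records exactly which nvcc was detected: /usr/bin/nvcc reporting 11.5. A quick way to see where that binary came from; a diagnostic sketch:)

ls -l /usr/bin/nvcc          # is there really a stray nvcc here?
/usr/bin/nvcc --version      # and does it report release 11.5?
dpkg -S /usr/bin/nvcc        # which apt package installed it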

The CMakeLists configuration in jetson-inference is unrelated to what torchvision uses, so changing jetson-inference won’t change torchvision. However, you can try commenting out this line of code in jetson-inference, removing your build directory, and rebuilding:
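(Whatever that line turns out to be, the rebuild needs a clean build directory, since CMakeCache.txt keeps the previously detected CUDA_VERSION; roughly, following the same steps as the original build:)

cd ~/jetson-inference
rm -rf build
mkdir build && cd build
cmake ../
make -j$(nproc)
sudo make install
sudo ldconfig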

At this point, I would recommend using the jetson-inference container for running your PyTorch training. If you prefer, you can still use jetson-inference outside of docker normally for the inferencing part. jetson-inference does not depend on PyTorch/torchvision at runtime, just for the training part of the tutorials (and the jetson-inference code is separate from the PyTorch training code).
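(For reference, the container route is only a couple of commands; a sketch assuming a stock JetPack image with docker already working:)

git clone --recursive --depth=1 https://github.com/dusty-nv/jetson-inference
cd jetson-inference
docker/run.sh        # detects the L4T version and pulls the matching prebuilt container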

The problem with the container approach is that it eats into precious memory, and all the memory is needed for the inference models. Even if you get a largish computer vision model working in your available memory, the virtual memory will likely still be greater than the physical memory on your machine; it likely only works because the resident memory stays small enough. For example, to run YoloV6 on an Orin NX the resident memory is only about 3.9 GB, but the virtual memory is around 19 GB for me. The last thing you want to do is waste memory on containers.

The only problem with containers is that if you really want your Jetson device on the edge to be production ready, then you need to run with a memory OverlayFS. You cannot run a docker container on top of this: docker also uses overlays, and these cannot run on top of a memory overlay. To get around it you need the docker directory on real disk, not on the overlay, but there is no real disk available in that situation. So you need to change the partitioning and migrate docker to one of the real partitions, and even then you still haven’t solved the fact that docker is eating your precious memory. My project creates new partitions so that you can use a memory OverlayFS on the main rootfs while keeping separate real disk available that you could, in principle, migrate your docker containers to, giving you the best of both worlds. I didn’t like having to use Docker because of the waste in memory resources, so I built Pytorch and PyVision from source so I didn’t have to. I also did that quite a while ago, when Pytorch and friends were not yet compatible with the Orin platform, so I found and fixed what was necessary to make them use the platform correctly, by the way.
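(For what it’s worth, once a real read/write partition exists, relocating docker’s storage onto it is a small config change; a sketch with illustrative paths, keeping any existing entries such as the nvidia runtime in daemon.json:)

sudo mkdir -p /mnt/nvme/docker            # a directory on real (non-overlay) disk; path illustrative
sudo nano /etc/docker/daemon.json         # add:  "data-root": "/mnt/nvme/docker"
sudo systemctl restart docker
docker info | grep "Docker Root Dir"      # confirm the new location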

And then there is the partitioning. The partitioning setup on the Jetsons is really complex, with many partitions. Mostly these are installed high now so you can shrink the rootfs, but if you want to move the other partitions you might have issues with the UEFI boot, so you have to be careful. And if this is the only Linux computer you have, it gets even more complex.

I realize that, for NVidia, this is work that is in principle intended to be done by the third-party suppliers, but in the case of computer vision a lot of pain could be saved if not only pytorch but also compatible pyvision builds were provided.

Now, aside from all of the problems above: I have actually solved all of these problems in a project that I’ll be releasing an Orin version of shortly :) In principle it could be used as a base for personal projects, but I can’t pre-release it. I just wanted to point out that the current development environment for computer vision projects must be extremely frustrating for people unless they have very deep Unix/Linux knowledge, and mostly that’s not the case. Very few Linux people are comfortable playing around with low-level partitioning tricks. That’s why I do all of this in the install code for my project (sbts-install).

However, for those that are interested, the Orin versions of compatible wheels for Jetpack 5.x that will be used for my project are available pre-built here:

and the page with links to versions for both Jetpack 5.x and Jetpack 4.6 is here:

They are fine for running YoloV6 and YoloV7 out of the box, so long as you also install mish-cuda.

On the subject of Ultralytics, the pytorch and pyvision libraries, and their versions: recently I needed to run a model that needed Ultralytics YoloV8. I had compatible versions of Pytorch and Pyvision, but not the ones Ultralytics wanted, so I just changed the requirements.txt file of the Ultralytics virtualenv to allow the ones I had, and then YoloV8 just worked.

Have you tried it on an Orin Nano? I have not gotten to your suggestion about commenting out line #66 in the CMakeLists.txt file - I will get to that soon enough. I have a strong preference for getting pytorch/torchvision working without containers, to preserve RAM as much as possible, as @KimHendrikse pointed out.

If the CMakeLists.txt suggestion does not work, then I’ll try again from scratch (new SD card), install pytorch/torchvision first with Pip Wheels, then build the Hello AI World project from source.

Containers are not virtual machines (they run natively in Linux cgroups), and the last time I checked there were only a few megabytes of difference in memory usage. That marginal difference is far outweighed by the complexity of managing these AI/ML software stacks by hand. And I always put the docker data root on NVMe.

Yes, I have tried jetson-inference on JetPack 6 on Orin Nano and not encountered any issues building it and installing PyTorch. Changing the jetson-inference CMakeLists will not impact the PyTorch setup; they are totally different codebases.

And if you are really concerned about optimizing memory usage, you should run the models through TensorRT for inferencing, not PyTorch.

Maybe one day I will try TensorRT, but at the moment it’s one more thing to learn and I have so much to do. I tried another migrated model in the past that was supposed to be equivalent, and the results were very different. I’m not saying that will be the case with TensorRT; I have no experience with it, but yes, it’s an unknown. Maybe it’s time for me to learn.

Are you saying that if I converted the large YoloV6 model to TensorRT and ran it through the tests, the AP scores would be identical to when the model runs on PyTorch? Because if it reduces the score, then it’s not good enough.

If you can absolutely say that there would be no degradation in precision and recall against the coco dataset by converting the model, then I’ll put it on my list to try. But if it quantises the weights, or if it’s not identical in terms of all the operations, then I can’t see this not having a negative effect on precision and recall.

I know how docker containers work, and my comments are based upon attempting to run a pytorch-based model on a 4 GB Jetson, where docker pushed it over the limit so it wouldn’t work. From memory, the large YoloV6 model uses something like 3.6 GB of resident memory. If I run two such models but only have 8 GB of memory, then even hundreds of MB are too much.

I agree that virtualisation can make some things a lot easier from a maintenance perspective. But it’s not as if it’s impossible to build compatible PyTorch and PyVision for the Jetson platform; I’ve done it myself and published the wheels.

There is a huge number of users who could benefit from compatible PyTorch and PyVision wheels, so for the community it would be worth the effort, I think.

If I’ve got a 16 GB machine, then it’s going to be less of a squeeze. But there are also 8GB machines. Even a 16GB machine might have trouble if you want to run three or four models.

My security system sbts-install is a multi-model system so I can make use of multiple models at the same time, but you have to have enough memory for them. Unnecessary waste is just a waste.

But you skipped over the memory OverlayFS part very quickly. If I put a Jetson in the middle of some wildlife area as it comes out of the box, and a power cut leaves it waiting at a console for an interactive fsck, what’s going to happen? There’s no framebuffer console, so connecting a screen and keyboard is not going to help. You would have to connect another computer via a special cable to the pins. Can you see a biologist doing that? Most likely the unit would have to be shipped back to civilisation, with a lot of pissed-off people left behind.

And when working with a memory OverlayFS, docker is yet another thing you have to deal with, as it won’t run on top of one. In my opinion, a system that doesn’t protect against disk corruption of the rootfs is not production ready.

Anyway, I’ll leave it here as my sbts-install project already takes care of it, but I see a lot of others confronted with this time and time again.

Just trying to make a helpful suggestion :-)

Kim

If you run TensorRT in FP16 mode there is typically no appreciable difference in accuracy vs PyTorch FP16, and if something is broken we fix bugs in TensorRT. You can also just run TensorRT with full FP32 precision. PyTorch takes up a lot of extra memory (~1GB IIRC) just to import it and load all the CUDA kernels it compiles (many of which may not even be used in your pipeline). INT8 quantization requires dataset calibration and depending on the quality you may see some drop-off (although TAO Toolkit is very good at doing this). Also there are a lot of TensorRT examples out there for YOLO.
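(For anyone weighing this up, the usual path is exporting the trained PyTorch model to ONNX and building an engine with trtexec, which ships with JetPack; a sketch with a toy model, where --fp16 is optional and omitting it keeps full FP32 precision:)

python3 - <<'EOF'
import torch
# toy model standing in for a trained network, just to show the export step
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU()).eval()
torch.onnx.export(model, torch.zeros(1, 3, 224, 224), "model.onnx")
EOF
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.engine --fp16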

The PyTorch wheels are released on our website, and you can extract the torchvision wheels from my torchvision container (they are saved under /opt inside the container; you can copy them out, install them on the host device, and then not use the container anymore).
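(Roughly, pulling a wheel out of the container and installing it on the host looks like this; the image tag and wheel filename below are assumptions to adjust for your L4T version:)

mkdir wheels
docker create --name tv dustynv/torchvision:r36.2.0    # image tag is an assumption
docker cp tv:/opt/. wheels/                             # the wheels are saved under /opt per above
docker rm tv
pip3 install wheels/torchvision-*.whl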

Also, there is a growing number of PyTorch add-on libraries like torchaudio, torchtext, etc., and it’s unrealistic for us to officially release wheels for all of these in perpetuity; however, I do build them in the containers because that’s automated.

Admittedly I don’t have experience with this, and you may very well have a point; however, what if you just made the host filesystem read-only and restarted the container on reboots? Regardless, this topic isn’t really a discussion about container vs. not; I like to support both ways, of course, depending on user preferences. Personally, yes, I use them because I maintain a ton of code and complex builds, and I need that automation to scale and to keep the host OS environment clean/sane.

In this particular case, I recommended trying the container even just for the PyTorch training, as @xplanescientist is stuck, I’m unable to reproduce it, and this issue hasn’t been reported elsewhere even though many folks have built torchvision on JetPack 6. And almost everyone in HPC environments is using containers for training too, because the dependencies get complex.

At a later stage I’ll come back and look into this, then. My priority now is getting the Orin install release code ready. Much of the complexity of this for me is because the Seeed Studio reComputer left off the SD card. My install approach involved changing the partitioning with an SD card present; so long as there’s an image for it, however, I could migrate the release to an NVMe.

Without an SD card I can’t do this, so another computer is needed. Admittedly, if you need to flash a new image, say because you want a bigger SSD (which I always would), you need another computer as well. But to change the partitioning you need extra hardware: an NVMe USB adapter. I’d like to avoid the need for extra hardware.

I’ve found a way to change the partitioning of a disk without requiring extra hardware, which I’ll implement for my install script. But boy, it’s not easy: I have to create a memory disk, pivot the root, change the partitioning, pivot the root again, and then continue the boot. But I made this work on the Raspberry Pi for my sbts-aru sound-localizing recorder project, so I can make it work for the Jetson.

And I’m making my project in two parts, so that people can use the re-partitioning/OverlayFS part to productionize their own projects.

Later this year when I have to start training my models anyway I’ll come back and have a look at the other frameworks for more memory efficiency.

One can make docker work with this approach as well if memory is not a concern; you just have to place the docker directory on the new read/write partition. I’m not doing this yet because I need the memory.

Kim

I started from scratch, commented out line #66 as you suggested, and ran cmake ../, but no luck. The CUDA mismatch error persists:

RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.2). Please make sure to use the same CUDA versions.
[jetson-inference] installation complete, exiting with status code 0
[jetson-inference] to run this tool again, use the following commands:
$ cd /build
$ ./install-pytorch.sh
[Pre-build] Finished CMakePreBuild script
-- Finished installing dependencies
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDA: /usr (found version "11.5")
-- CUDA version: 11.5
-- CUDA 11.5 detected (aarch64), enabling SM_53 SM_62
-- CUDA 11.5 detected (aarch64), enabling SM_72
-- CUDA 11.5 detected (aarch64), enabling SM_87

I’ll try one more idea before moving on to the containers. I’ll start with a clean slate (re-image SD card) and install pytorch/torchvision with Pip Wheels before building the Hello AI World project.