Unable to get Pytorch vision & audio working on Jetpack 6.2

In summary: after following the download and install instructions for pytorch, torchaudio, and torchvision from the forum discussion listed below, importing torchaudio and torchvision fails with the following two errors.

torchaudio:

OSError: libtorch_cuda.so: cannot open shared object file: No such file or directory

torchvision:

RuntimeError: operator torchvision::nms does not exist

See below for full details

Setup info:
Jetson Orin AGX 64GB developer edition
Jetpack 6.2

$ python --version
Python 3.10.16
$ sudo apt list --installed nvidia-jetpack
Listing... Done
nvidia-jetpack/stable,now 6.2+b77 arm64 [installed]
$ cat /etc/nv_tegra_release
# R36 (release), REVISION: 4.3, GCID: 38968081, BOARD: generic, EABI: aarch64, DATE: Wed Jan  8 01:49:37 UTC 2025
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia

I’m trying to follow the discussions on getting pytorch (including vision & audio) installed and working from here

pytorch and torchaudio download properly from the links in that post.

The torchvision download fails, indicating the file is gone:

$ python --version
Python 3.10.16
$ wget https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/aa2/2da8dcf4c4c8d/torchvision-0.21.0-cp310-cp310-linux_aarch64.whl#sha256=aa22da8dcf4c4c8dc897e7922b1ef25cb0fe350e1a358168be87a854ad114531
--2025-05-31 09:42:55--  https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/aa2/2da8dcf4c4c8d/torchvision-0.21.0-cp310-cp310-linux_aarch64.whl
Resolving pypi.jetson-ai-lab.dev (pypi.jetson-ai-lab.dev)... 108.32.95.36
Connecting to pypi.jetson-ai-lab.dev (pypi.jetson-ai-lab.dev)|108.32.95.36|:443... connected.
HTTP request sent, awaiting response... 410 Gone
2025-05-31 09:42:56 ERROR 410: Gone.

To fix this, I substituted the current cu126 torchvision link instead, and that downloaded properly:

$ wget https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/daa/bff3a07259968/torchvision-0.22.0-cp310-cp310-linux_aarch64.whl#sha256=daabff3a0725996886b92e4b5dd143f5750ef4b181b5c7d01371a9185e8f0402
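Since the index links carry a `#sha256=` fragment, a downloaded wheel can be checked against it before installing. This is a minimal sketch, assuming the wheel was saved in the current directory under the file name from the URL; `expected_sha256` and `sha256_of` are hypothetical helper names, not part of pip or wget.

```python
import hashlib
from pathlib import Path
from urllib.parse import urldefrag


def expected_sha256(url):
    """Return the hex digest from a '#sha256=<hex>' URL fragment, or None."""
    _, fragment = urldefrag(url)
    key, _, value = fragment.partition("=")
    return value.lower() if key == "sha256" and value else None


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large wheel is not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    url = (
        "https://pypi.jetson-ai-lab.dev/jp6/cu126/+f/daa/bff3a07259968/"
        "torchvision-0.22.0-cp310-cp310-linux_aarch64.whl"
        "#sha256=daabff3a0725996886b92e4b5dd143f5750ef4b181b5c7d01371a9185e8f0402"
    )
    # Assumption: wget saved the file under the last path component of the URL.
    wheel = Path(urldefrag(url)[0].rsplit("/", 1)[-1])
    if wheel.exists():
        print("match" if sha256_of(wheel) == expected_sha256(url) else "MISMATCH")
    else:
        print("wheel not found:", wheel)
```

A mismatch would point at a truncated or substituted download rather than at a build problem.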

numpy is installed properly per the instructions.

When I move on from the install steps in that forum post to verifying the installation, by starting python3 and importing torchaudio and torchvision, I get the following errors.

$ python3
Python 3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchaudio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchaudio/__init__.py", line 2, in <module>
    from . import _extension  # noqa  # usort: skip
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchaudio/_extension/__init__.py", line 38, in <module>
    _load_lib("libtorchaudio")
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchaudio/_extension/utils.py", line 60, in _load_lib
    torch.ops.load_library(path)
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/_ops.py", line 1392, in load_library
    ctypes.CDLL(path)
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libtorch_cuda.so: cannot open shared object file: No such file or directory

and a different error for torchvision

$ python3
Python 3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/library.py", line 1023, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/library.py", line 214, in _register_fake
    handle = entry.fake_impl.register(func_to_register, source)
  File "/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/_library/fake_impl.py", line 31, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
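The torchaudio OSError above says the dynamic loader could not find libtorch_cuda.so, which a CUDA-enabled PyTorch build on Linux normally ships inside its own `torch/lib` directory. A quick way to see whether the installed torch package has it is sketched below; `torch_lib_dir` and `has_cuda_library` are hypothetical helper names, and the check locates the package without importing it.

```python
import importlib.util
from pathlib import Path


def torch_lib_dir():
    """Locate <site-packages>/torch/lib without importing torch; None if torch is absent."""
    spec = importlib.util.find_spec("torch")
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0]) / "lib"


def has_cuda_library(lib_dir):
    """True if the given directory contains libtorch_cuda.so."""
    return lib_dir is not None and (Path(lib_dir) / "libtorch_cuda.so").is_file()


if __name__ == "__main__":
    lib_dir = torch_lib_dir()
    print("torch lib dir:", lib_dir)
    print("libtorch_cuda.so present:", has_cuda_library(lib_dir))
```

If the file is absent, the installed torch is most likely a CPU-only build, and extensions compiled against a CUDA build of torch will not load against it.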

It doesn’t take too long to compile torchvision or torchaudio. If you do the following within the venv where you have pytorch from pypi.jetson-ai-lab.dev/jp6/cu126/, the torchvision non-maximum suppression (NMS) library will be built for your environment.

Activate your conda/venv

git clone https://github.com/pytorch/vision.git
cd vision
git checkout release/0.21

sudo apt update
sudo apt install -y build-essential ninja-build cmake \
    libjpeg-dev libpng-dev libavcodec-dev libavformat-dev libswscale-dev

# Match the compiler that built PyTorch (here assuming GCC 11)

sudo apt install -y gcc-11 g++-11
export CC=gcc-11
export CXX=g++-11

export TORCH_CUDA_ARCH_LIST="8.7"
export FORCE_CUDA=1
export MAX_JOBS=6
pip install --upgrade pip wheel 'setuptools>=69' cython

python3 -m pip wheel -v . --no-deps --no-build-isolation -w dist

pip install dist/torchvision-*.whl


Quick verification script (also attached, since the forum post lost its formatting):
nms_test_keeping_indents.txt (487 Bytes)

python3 - <<'EOF'
import torch, torchvision, os
print("Torch:", torch.__version__, "Torchvision:", torchvision.__version__)
try:
    from torchvision.ops import nms
    boxes = torch.rand(1000,4,device='cuda') * 512
    boxes[:,2:] += boxes[:,:2]  # make (x2,y2) ≥ (x1,y1)
    scores = torch.rand(1000,device='cuda')
    keep = nms(boxes, scores, 0.5)
    print("CUDA NMS succeeded – kept", keep.numel(), "boxes")
except Exception as e:
    print("NMS failed:", e)
EOF

Ooh NICE thanks for the walkthrough, will step through that monday and reply back!

ok so I’ve confirmed that gcc 11.2 was used to compile the pytorch version from the post

(py3.10) $ python3
Python 3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__config__.show())
PyTorch built with:
  - GCC 11.2
  - C++ Version: 201703
  - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: DEFAULT
  - Build settings: BLAS_INFO=open, BUILD_TYPE=Release, COMMIT_SHA=134179474539648ba7dee1317959529fbd0e7f89, CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/c++, CXX_FLAGS=-ffunction-sections -fdata-sections -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_PYTORCH_QNNPACK -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=open, TORCH_VERSION=2.7.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,
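Note that the build settings above report USE_CUDA=OFF and USE_CUDNN=OFF, which is what a CPU-only PyTorch build prints; that would be consistent with the missing libtorch_cuda.so. The sketch below pulls individual flags out of that text. `build_flags` is a hypothetical helper, and the parsing assumes the comma-separated KEY=VALUE layout shown above.

```python
import re


def build_flags(config_text):
    """Collect KEY=VALUE pairs (e.g. USE_CUDA=OFF) from torch.__config__.show() text."""
    flags = {}
    # A value runs up to the next comma or newline, so the whole CXX_FLAGS
    # string (which itself contains '=' signs) is captured as one value.
    for key, value in re.findall(r"\b([A-Z][A-Z0-9_]*)=([^,\n]*)", config_text):
        flags[key] = value.strip()
    return flags


if __name__ == "__main__":
    try:
        import torch

        flags = build_flags(torch.__config__.show())
        print("USE_CUDA =", flags.get("USE_CUDA"))
        print("torch.version.cuda =", torch.version.cuda)  # None on CPU-only builds
    except ImportError:
        print("torch is not importable in this environment")
```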

I’m not able to follow what you are providing with the example tho… I’m not following the g+±11 (±) portion of the name, I can’t find reference to that anywhere?

(py3.10) $ sudo apt install -y gcc-11 g+±11

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package g+±11

is that shorthand for g++-11? Using that worked

(py3.10) agrajag@tor-orin:~/vision$ sudo apt install -y gcc-11 g++-11
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
g++-11 is already the newest version (11.4.0-1ubuntu1~22.04).
g++-11 set to manually installed.
gcc-11 is already the newest version (11.4.0-1ubuntu1~22.04).
gcc-11 set to manually installed.
The following packages were automatically installed and are no longer required:
  gdal-data libarmadillo10 libarpack2 libblosc1 libcfitsio9 libcharls2 libdc1394-dev libdeflate-dev
  libdouble-conversion3 libexif-dev libfreexl1 libfyba0 libgdal30 libgdcm-dev libgdcm3.0 libgeos-c1v5
  libgeos3.10.2 libgeotiff5 libgl2ps1.4 libglew2.2 libgphoto2-dev libhdf4-0-alt libheif1
  libilmbase-dev libjbig-dev libkmlbase1 libkmldom1 libkmlengine1 liblept5 libminizip1
  libmysqlclient21 libnetcdf19 libodbc2 libodbcinst2 libogdi4.1 libopencv-calib3d4.5d
  libopencv-contrib4.5d libopencv-dnn4.5d libopencv-features2d4.5d libopencv-flann4.5d
  libopencv-highgui4.5d libopencv-imgcodecs4.5d libopencv-imgproc4.5d libopencv-ml4.5d
  libopencv-objdetect4.5d libopencv-photo4.5d libopencv-shape4.5d libopencv-stitching4.5d
  libopencv-superres4.5d libopencv-video4.5d libopencv-videoio4.5d libopencv-videostab4.5d
  libopencv-viz4.5d libopenexr-dev libpq5 libproj22 libraw1394-dev librttopo1 libsocket++1
  libspatialite7 libsuperlu5 libtbb-dev libtesseract4 libtiff-dev libtiffxx5 liburiparser1 libvtk9.1
  libxerces-c3.2 mysql-common proj-data unixodbc-common
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 267 not upgraded.

I’m realizing the stupid editor for the forum post collapses the +- to ± when you type it but does not when you paste it… wow

I’m using a backslash to force the dash after the plus in the post for anyone reading along… BOOOO whatever forum tool this is..

the forum editor is forcing smartquotes as well, not fun
pip install --upgrade pip wheel ‘setuptools>=69’ cython
vs

pip install --upgrade pip wheel 'setuptools>=69' cython
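For anyone copying commands out of a post that was mangled this way, the substitutions can be reversed mechanically. This is a small sketch; `unsmarten` is a hypothetical helper, it only covers the two substitutions seen in this thread (smart quotes and `+-` collapsed to `±`), and it is meant for command lines rather than prose.

```python
# Characters the forum editor substituted, mapped back to what was typed.
REPLACEMENTS = {
    "\u2018": "'",   # left single smart quote
    "\u2019": "'",   # right single smart quote
    "\u201c": '"',   # left double smart quote
    "\u201d": '"',   # right double smart quote
    "\u00b1": "+-",  # plus-minus sign, collapsed from "+-"
}


def unsmarten(command):
    """Return the command with forum-substituted characters restored to ASCII."""
    for bad, good in REPLACEMENTS.items():
        command = command.replace(bad, good)
    return command


if __name__ == "__main__":
    print(unsmarten("sudo apt install -y gcc-11 g+\u00b111"))
    print(unsmarten("pip install --upgrade pip wheel \u2018setuptools>=69\u2019 cython"))
```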

OK I’ll keep going with working on the compile.

status update:

I still ended up with the nms does not exist message after compiling vision and validating with the python3 import test steps. I’m going to fall back to compiling pytorch as the starting point on the system and then re-attempt vision.

For anyone following along:
Get tmux installed so the build persists if your laptop goes to sleep during a remote session with your Jetson.
Use tmux new -s buildtorch to start a persistent shell for the compile, and tmux attach -t buildtorch to reattach to it. I learned that the hard way overnight.

If your compile does end abruptly before it completes (by disconnecting or whatever), make sure to clean up before re-attempting it…

$ cd pytorch
$ rm -rf build/ dist/ *.egg-info

With tmux, you can then run your build with confidence, knowing that if the SSH terminal closes for any reason, the build will continue.

I’m also finding that I need to have CUDA_HOME set, or pass CUDA_HOME to the build command:

$ CUDA_HOME=/usr/local/cuda-12.6  python3 -m pip wheel -v . --no-deps --no-build-isolation -w dist

I suspect I’m pushing my luck by hard-coding cuda-12.6, but let’s see if it works…
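Rather than hard-coding a versioned path, the toolkit directory can be looked up first. This is a sketch under the assumption that the toolkit is either named by `$CUDA_HOME` or reachable through a `/usr/local/cuda` symlink (a common layout, not verified for this system); `find_cuda_home` is a hypothetical helper.

```python
import os
from pathlib import Path


def find_cuda_home(env=None, candidates=("/usr/local/cuda",)):
    """Return a directory containing bin/nvcc, preferring $CUDA_HOME; None if not found."""
    env = os.environ if env is None else env
    paths = [env["CUDA_HOME"]] if env.get("CUDA_HOME") else []
    paths.extend(candidates)
    for candidate in paths:
        root = Path(candidate)
        if (root / "bin" / "nvcc").is_file():
            # resolve() follows symlinks, so the versioned directory is reported.
            return str(root.resolve())
    return None


if __name__ == "__main__":
    print("CUDA_HOME candidate:", find_cuda_home())
```

The printed path can then be exported as CUDA_HOME before running the wheel build.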

If the compile fails, you could try this for pytorch/pytorch. (I dropped the backslashes.)

sudo apt update && sudo apt install -y python3-dev libnccl2 libnccl-dev libopenblas-dev libblas-dev m4 cmake git libjpeg-dev libpng-dev libfreetype6-dev build-essential libffi-dev libssl-dev libprotobuf-dev protobuf-compiler libgoogle-glog-dev libgflags-dev libopenmpi-dev libomp-dev ninja-build nvidia-cudnn9 nvidia-cudnn9-dev

pip install --upgrade pip setuptools wheel

pip install numpy typing_extensions requests pyyaml sympy ninja fsspec future six jinja2 hypothesis pytest

export USE_CUDA=1
export USE_CUDNN=1
export USE_NCCL=1
export TORCH_CUDA_ARCH_LIST="8.7"
export MAX_JOBS=6

python3 -m pip wheel -v . --no-deps --no-build-isolation -w dist

pip install dist/*.whl

Thanks, torch did compile successfully. I had to work on other projects today; I will review your steps to make sure I didn’t miss anything and will resume tomorrow.

Out of curiosity;

What repo do you have added that is providing libnccl2 and libnccl-dev? I error out on those immediately?

Package libnccl2 is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Package 'libnccl2' has no installation candidate
E: Unable to locate package libnccl-dev

Never mind, found it here

I was working on a Dockerfile today, and discovered that for the pytorch2.7.whl I compiled in March, the docker container I installed it in wanted nccl, cublas, cusparselt and some other nvidia math and communication libraries.

I think the nvidia cu12 python packages from pypi.nvidia.com may be sufficient.


Below is how I installed the full apt ARM SBSA download:
pick the NCCL ARM download that corresponds to your CUDA version.

For a downloaded nccl*.deb, here’s the NVIDIA install guide.

Thank you, I’ve gotten things working. I had to use the local repo setup to download and install NCCL from the NVIDIA Developer page (login required); neither the current nor the archived network installer would work when following the instructions there.

Use tmux to create a persisted shell, the pytorch build takes a long time.

$ tmux new -s buildtorch
# and if you timeout/exit ssh shell later on, to reattach
$ tmux attach -t buildtorch

You have to enter your conda env and set the environment again after this.

I hit a few errors, and while troubleshooting I ended up with the following environment, which let me successfully build pytorch from source first and then vision. Note that I was compiling directly on the Jetson AGX Orin 64GB unit.

export CC=gcc-11
export CXX=g++-11
export CUDA_HOME=/usr/local/cuda-12.9
export FORCE_CUDA=1
export USE_CUDA=1 #probably redundant
export USE_CUDNN=1
export TORCH_CUDA_ARCH_LIST="8.7"
export USE_NEON=1
export USE_NCCL=1
export MAX_JOBS=6
export USE_PRIORITIZED_TEXT_FOR_LD=1 # because of errors I was getting
# the following is probably over-configuration, specifying my conda env
export CMAKE_PREFIX_PATH=/opt/miniconda3/envs/py3.10

# and then the recommended build and install commands

$ python3 -m pip wheel -v . --no-deps --no-build-isolation -w dist

$ pip install dist/*.whl
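Shell variables only reach the build if they are exported, so one way to make sure every setting arrives is to hand the environment to the build process explicitly. This is a sketch that reuses the values listed above; `build_env` and `wheel_command` are hypothetical helpers, and nothing is built unless you uncomment the `subprocess.run` line from inside the source checkout.

```python
import os
import shlex
import sys

# Values copied from the environment listed above.
BUILD_SETTINGS = {
    "CC": "gcc-11",
    "CXX": "g++-11",
    "CUDA_HOME": "/usr/local/cuda-12.9",
    "FORCE_CUDA": "1",
    "USE_CUDA": "1",
    "USE_CUDNN": "1",
    "TORCH_CUDA_ARCH_LIST": "8.7",
    "USE_NEON": "1",
    "USE_NCCL": "1",
    "MAX_JOBS": "6",
    "USE_PRIORITIZED_TEXT_FOR_LD": "1",
    "CMAKE_PREFIX_PATH": "/opt/miniconda3/envs/py3.10",
}


def build_env(base=None):
    """Copy the base environment (default: os.environ) and overlay the build settings."""
    env = dict(os.environ if base is None else base)
    env.update(BUILD_SETTINGS)
    return env


def wheel_command(python=sys.executable):
    """The same pip wheel invocation used in this thread, as an argument list."""
    return [python, "-m", "pip", "wheel", "-v", ".",
            "--no-deps", "--no-build-isolation", "-w", "dist"]


if __name__ == "__main__":
    print(shlex.join(wheel_command()))
    # import subprocess
    # subprocess.run(wheel_command(), env=build_env(), check=True)
```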

Results of testing the pytorch & vision build/install:

$ python3
Python 3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchvision
>>> import os
>>> print("Torch:", torch.version, "Torchvision:", torchvision.version)
Torch: <module 'torch.version' from '/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/version.py'> Torchvision: <module 'torchvision.version' from '/opt/miniconda3/envs/py3.10/lib/python3.10/site-packages/torchvision/version.py'>
# and the suggested validation test in python
>>> from torchvision.ops import nms
>>> boxes = torch.rand(1000,4,device='cuda') * 512
>>> boxes[:,2:] += boxes[:,:2] # make (x2,y2) ≥ (x1,y1)
>>> scores = torch.rand(1000,device='cuda')
>>> keep = nms(boxes, scores, 0.5)
>>> print("CUDA NMS succeeded – kept", keep.numel(), "boxes")
CUDA NMS succeeded – kept 434 boxes
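The `print` call in the session above shows module objects because `torch.version` and `torchvision.version` are modules; the version strings themselves are in `torch.__version__` and `torchvision.__version__`. The sketch below reads the installed versions from package metadata instead, which also works when the import itself fails; `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "torchvision", "torchaudio"):
        print(name, installed_version(name))
```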

In summary (and of note):

As stated in my previous update, always clear out the previous build attempt after hitting errors, from within your source code path:

$ rm -rf build/ dist/ *.egg-info
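The same cleanup can be scripted so it is easy to run before every retry. This is a minimal sketch equivalent in intent to the `rm -rf` line above; `clean_build_artifacts` is a hypothetical helper, and it deletes files, so point it only at the source checkout.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(source_dir="."):
    """Remove build/, dist/ and *.egg-info under source_dir; return the names removed."""
    root = Path(source_dir)
    targets = [root / "build", root / "dist", *root.glob("*.egg-info")]
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target.name)
        elif target.exists():
            target.unlink()
            removed.append(target.name)
    return sorted(removed)


if __name__ == "__main__":
    print("removed:", clean_build_artifacts("."))
```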

Also of note for posting in this forum tool: pasting examples of code and configuration into a post generates “smart quotes” around the code for both single and double quotes. Using code blocks to present examples prevents this.

Thanks @whitesscott for your help!