Hi Johnny, is there a fixed Torch wheel now? If so, where can we find it?
Hi,
We have uploaded the legacy wheels. Please refer to them there.
Thanks
Hi,
These PyTorch and torchvision wheels worked for me, but I'm unsure how long they will keep working, since many users who installed previous versions ran into errors after a few weeks. So if I find issues again, I'll definitely create a new topic stating the issue.
For now, thank you very much for the quick resolution of this issue. :)
"To all others: this solution worked on Sep 3, 2025, but I'm not sure how long it will keep working. I request NVIDIA engineers to publish a proper step-by-step guide for building torch and torchvision, including fixes for known errors, as soon as possible."
Hi David,
I get the following error when I try to install the Torch wheels.
$ pip install torch-2.8.0-cp310-cp310-linux_aarch64-2.whl
ERROR: Invalid build number: cp310 in 'torch-2.8.0-cp310-cp310-linux_aarch64-2'
hillary@ubuntu:~/torch_wheels$
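The "Invalid build number" error comes from the renamed file, not from pip or the wheel itself. Wheel filenames follow the convention {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl, and an optional build tag must start with a digit; with the extra "-2" suffix the name has six dash-separated fields, so "cp310" lands in the build slot. Renaming the file back to torch-2.8.0-cp310-cp310-linux_aarch64.whl should make it installable. A simplified sketch of that parsing (the regex is illustrative, not pip's actual implementation):

```python
import re

# Illustrative wheel-filename splitter (simplified; pip's real parser is stricter).
# Format: {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
WHEEL_RE = re.compile(
    r"^(?P<dist>[^-]+)-(?P<version>[^-]+)"
    r"(-(?P<build>[^-]+))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)

def parse(name):
    m = WHEEL_RE.match(name)
    if m is None:
        raise ValueError(f"not a wheel filename: {name!r}")
    parts = m.groupdict()
    build = parts["build"]
    # A build tag, when present, must begin with a digit.
    if build is not None and not build[0].isdigit():
        raise ValueError(f"Invalid build number: {build} in {name!r}")
    return parts

# Renamed file: six fields, so 'cp310' is parsed as the (invalid) build tag.
try:
    parse("torch-2.8.0-cp310-cp310-linux_aarch64-2.whl")
except ValueError as e:
    print(e)

# The original five-field name parses cleanly.
print(parse("torch-2.8.0-cp310-cp310-linux_aarch64.whl")["platform"])
```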
Platform Machine: aarch64
System: Linux Distribution: Ubuntu 22.04 Jammy Jellyfish
Release: 5.15.148-tegra
Python: 3.10.12
Libraries
CUDA: 12.6.85
CUDNN: 9.4.0
TensorRT: 10.7.0.23
VPI: 3.2.4
Vulkan: 1.3.204
OpenCV: 4.11.0 with CUDA: YES
Serial Number: 1421424266433
Hardware Model: NVIDIA Jetson AGX Orin Developer Kit
P-Number: p3701-0005
Module: NVIDIA Jetson AGX Orin (64GB ram)
Soc: tegra234
CUDA Arch BIN: 8.7
L4T: 36.4.4
Jetpack: 6.2.1
It is on PyPI:
pip3 install --force-reinstall torch --index-url https://pypi.jetson-ai-lab.io/jp6/cu126
johnny@johnny-jetson:~/Projects/jetson-containers$ python3 -c 'import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.backends.cudnn.version()); print(torch.__config__.show());'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/johnny/.local/lib/python3.10/site-packages/torch/__init__.py", line 416, in <module>
from torch._C import * # noqa: F403
ImportError: libcudss.so.0: cannot open shared object file: No such file or directory
johnny@johnny-jetson:~/Projects/jetson-containers$ wget https://developer.download.nvidia.com/compute/cudss/0.6.0/local_installers/cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo dpkg -i cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo cp /var/cudss-local-tegra-repo-ubuntu2204-0.6.0/cudss-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudss
--2025-09-03 12:37:26-- https://developer.download.nvidia.com/compute/cudss/0.6.0/local_installers/cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 23.33.143.170, 23.33.143.138
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|23.33.143.170|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33202980 (32M) [application/x-deb]
Saving to: ‘cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb’
cudss-local-tegra-repo-ubun 100%[=========================================>] 31,66M 151MB/s in 0,2s
2025-09-03 12:37:28 (151 MB/s) - ‘cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb’ saved [33202980/33202980]
[sudo] password for johnny:
Selecting previously unselected package cudss-local-tegra-repo-ubuntu2204-0.6.0.
(Reading database ... 226735 files and directories currently installed.)
Preparing to unpack cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb ...
Unpacking cudss-local-tegra-repo-ubuntu2204-0.6.0 (0.6.0-1) ...
Setting up cudss-local-tegra-repo-ubuntu2204-0.6.0 (0.6.0-1) ...
The public cudss-local-tegra-repo-ubuntu2204-0.6.0 GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cudss-local-tegra-repo-ubuntu2204-0.6.0/cudss-local-tegra-72E455D5-keyring.gpg /usr/share/keyrings/
Get:1 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 InRelease [1.572 B]
Get:1 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 InRelease [1.572 B]
Hit:3 https://download.docker.com/linux/ubuntu jammy InRelease
Hit:4 https://repo.download.nvidia.com/jetson/common r36.4 InRelease
Hit:2 https://apt.llvm.org/jammy llvm-toolchain-jammy-17 InRelease
Hit:5 http://ports.ubuntu.com/ubuntu-ports jammy InRelease
Hit:6 https://repo.download.nvidia.com/jetson/t234 r36.4 InRelease
Hit:7 https://ppa.launchpadcontent.net/obsproject/obs-studio/ubuntu jammy InRelease
Hit:8 https://repo.download.nvidia.com/jetson/ffmpeg r36.4 InRelease
Get:9 http://ports.ubuntu.com/ubuntu-ports jammy-updates InRelease [128 kB]
Get:10 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 Packages [1.535 B]
Get:11 https://packages.microsoft.com/repos/code stable InRelease [3.590 B]
Get:12 https://pkgs.tailscale.com/stable/ubuntu jammy InRelease
Hit:13 https://apt.kitware.com/ubuntu jammy InRelease
Get:14 http://ports.ubuntu.com/ubuntu-ports jammy-backports InRelease [127 kB]
Get:15 http://ports.ubuntu.com/ubuntu-ports jammy-security InRelease [129 kB]
Get:16 https://packages.microsoft.com/repos/code stable/main arm64 Packages [20,4 kB]
Get:17 https://packages.microsoft.com/repos/code stable/main armhf Packages [20,5 kB]
Get:18 https://packages.microsoft.com/repos/code stable/main amd64 Packages [20,3 kB]
Get:19 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 Packages [2.685 kB]
Get:20 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 DEP-11 Metadata [112 kB]
Get:21 http://ports.ubuntu.com/ubuntu-ports jammy-updates/restricted arm64 DEP-11 Metadata [212 B]
Get:22 http://ports.ubuntu.com/ubuntu-ports jammy-updates/universe arm64 Packages [1.238 kB]
Get:23 http://ports.ubuntu.com/ubuntu-ports jammy-updates/universe arm64 DEP-11 Metadata [356 kB]
Get:24 http://ports.ubuntu.com/ubuntu-ports jammy-updates/multiverse arm64 DEP-11 Metadata [212 B]
Get:25 http://ports.ubuntu.com/ubuntu-ports jammy-backports/main arm64 DEP-11 Metadata [3.580 B]
Get:26 http://ports.ubuntu.com/ubuntu-ports jammy-backports/restricted arm64 DEP-11 Metadata [212 B]
Get:27 http://ports.ubuntu.com/ubuntu-ports jammy-backports/universe arm64 DEP-11 Metadata [25,6 kB]
Get:28 http://ports.ubuntu.com/ubuntu-ports jammy-backports/multiverse arm64 DEP-11 Metadata [212 B]
Get:29 http://ports.ubuntu.com/ubuntu-ports jammy-security/main arm64 DEP-11 Metadata [54,5 kB]
Get:30 http://ports.ubuntu.com/ubuntu-ports jammy-security/restricted arm64 DEP-11 Metadata [208 B]
Get:31 http://ports.ubuntu.com/ubuntu-ports jammy-security/universe arm64 DEP-11 Metadata [125 kB]
Get:32 http://ports.ubuntu.com/ubuntu-ports jammy-security/multiverse arm64 DEP-11 Metadata [208 B]
Fetched 5.056 kB in 6s (791 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
cmake-data dh-elpa-helper gdal-data libaec0 libarmadillo10 libarpack2 libavcodec-dev libavformat-dev
libavutil-dev libblosc1 libcfitsio9 libcharls2 libdc1394-dev libdeflate-dev libexif-dev libfreexl1
libfyba0 libgdal30 libgdcm-dev libgdcm3.0 libgeos-c1v5 libgeos3.10.2 libgeotiff5 libgl2ps1.4 libglew2.2
libgphoto2-dev libhdf4-0-alt libhdf5-103-1 libhdf5-hl-100 libheif1 libilmbase-dev libjbig-dev libjpeg-dev
libjpeg-turbo8-dev libjpeg8-dev libjsoncpp25 libkmlbase1 libkmldom1 libkmlengine1 liblept5 libminizip1
libmysqlclient21 libnetcdf19 libodbc2 libodbcinst2 libogdi4.1 libopencv-calib3d4.5d libopencv-contrib4.5d
libopencv-dnn4.5d libopencv-features2d4.5d libopencv-flann4.5d libopencv-highgui4.5d
libopencv-imgcodecs4.5d libopencv-imgproc4.5d libopencv-ml4.5d libopencv-objdetect4.5d libopencv-photo4.5d
libopencv-shape4.5d libopencv-stitching4.5d libopencv-superres4.5d libopencv-video4.5d
libopencv-videoio4.5d libopencv-videostab4.5d libopencv-viz4.5d libopenexr-dev libpq5 libproj22
libraw1394-dev librhash0 librttopo1 libsocket++1 libspatialite7 libsuperlu5 libswresample-dev
libswscale-dev libsz2 libtesseract4 libtiff-dev libtiffxx5 liburiparser1 libvtk9.1 libxerces-c3.2
mysql-common proj-data unixodbc-common
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
cudss-cuda-12 cudss0 libcudss0-cuda-12 libcudss0-dev-cuda-12 libcudss0-static-cuda-12
The following NEW packages will be installed:
cudss cudss-cuda-12 cudss0 libcudss0-cuda-12 libcudss0-dev-cuda-12 libcudss0-static-cuda-12
0 upgraded, 6 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/33,2 MB of archives.
After this operation, 81,2 MB of additional disk space will be used.
Get:1 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 libcudss0-cuda-12 0.6.0.5-1 [16,9 MB]
Get:2 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 libcudss0-dev-cuda-12 0.6.0.5-1 [34,0 kB]
Get:3 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 libcudss0-static-cuda-12 0.6.0.5-1 [16,3 MB]
Get:4 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 cudss-cuda-12 0.6.0.5-1 [10,4 kB]
Get:5 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 cudss0 0.6.0-1 [2.410 B]
Get:6 file:/var/cudss-local-tegra-repo-ubuntu2204-0.6.0 cudss 0.6.0-1 [2.406 B]
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package libcudss0-cuda-12.
(Reading database ... 226754 files and directories currently installed.)
Preparing to unpack .../0-libcudss0-cuda-12_0.6.0.5-1_arm64.deb ...
Unpacking libcudss0-cuda-12 (0.6.0.5-1) ...
Selecting previously unselected package libcudss0-dev-cuda-12.
Preparing to unpack .../1-libcudss0-dev-cuda-12_0.6.0.5-1_arm64.deb ...
Unpacking libcudss0-dev-cuda-12 (0.6.0.5-1) ...
Selecting previously unselected package libcudss0-static-cuda-12.
Preparing to unpack .../2-libcudss0-static-cuda-12_0.6.0.5-1_arm64.deb ...
Unpacking libcudss0-static-cuda-12 (0.6.0.5-1) ...
Selecting previously unselected package cudss-cuda-12.
Preparing to unpack .../3-cudss-cuda-12_0.6.0.5-1_arm64.deb ...
Unpacking cudss-cuda-12 (0.6.0.5-1) ...
Selecting previously unselected package cudss0.
Preparing to unpack .../4-cudss0_0.6.0-1_arm64.deb ...
Unpacking cudss0 (0.6.0-1) ...
Selecting previously unselected package cudss.
Preparing to unpack .../5-cudss_0.6.0-1_arm64.deb ...
Unpacking cudss (0.6.0-1) ...
Setting up libcudss0-cuda-12 (0.6.0.5-1) ...
Setting up libcudss0-dev-cuda-12 (0.6.0.5-1) ...
Setting up libcudss0-static-cuda-12 (0.6.0.5-1) ...
Setting up cudss-cuda-12 (0.6.0.5-1) ...
update-alternatives: using /usr/lib/aarch64-linux-gnu/libcudss/12/libcudss.so.0 to provide /usr/lib/aarch64-linux-gnu/libcudss.so.0 (cudss) in auto mode
Setting up cudss0 (0.6.0-1) ...
Setting up cudss (0.6.0-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.10) ...
johnny@johnny-jetson:~/Projects/jetson-containers$ python3 -c 'import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.backends.cudnn.version()); print(torch.__config__.show());'
2.8.0
True
90300
PyTorch built with:
- GCC 11.4
- C++ Version: 201703
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: DEFAULT
- CUDA Runtime 12.6
- NVCC architecture flags: -gencode;arch=compute_87,code=sm_87
- CuDNN 90.3
- Build settings: BLAS_INFO=open, BUILD_TYPE=Release, COMMIT_SHA=ba56102387ef21a3b04b357e5b183d48f0afefc7, CUDA_VERSION=12.6, CUDNN_VERSION=9.3.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS=-ffunction-sections -fdata-sections -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_PYTORCH_QNNPACK -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=open, TORCH_VERSION=2.8.0, USE_CUDA=ON, USE_CUDNN=1, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=1, USE_MKLDNN=OFF, USE_MPI=0, USE_NCCL=1, USE_NNPACK=1, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF,
johnny@johnny-jetson:~/Projects/jetson-containers$
wget https://developer.download.nvidia.com/compute/cudss/0.6.0/local_installers/cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo dpkg -i cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo cp /var/cudss-local-tegra-repo-ubuntu2204-0.6.0/cudss-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudss
What about cuSPARSELt? Should that be installed on a new device, or do the pre-built torch wheels include it? [before building/installing torch]
I still have to investigate, but cuSPARSELt seems to come with the NVIDIA toolkit packages.
Sometimes, as you mention, wheels are built in static mode, and all those binaries are bundled inside the .whl.
All our wheels are created by jetson-containers (GitHub: dusty-nv/jetson-containers, Machine Learning Containers for NVIDIA Jetson and JetPack-L4T); we are not using cibuildwheel or any similar CI.
Hi,
I'm facing the exact same issues as above.
I tried the fix and it worked, but when I try to test,
I get the following error:
NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend.
Maybe your PyCharm project has a different version of torch/torchvision. Please check; if so, remove them and point the interpreter to the torch/torchvision that was built outside the virtual environment.
Final thoughts: if you can't point the interpreter to the externally built torch, try installing torch and torchvision inside the venv from the PyCharm terminal. :)
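To see which torch/torchvision a given interpreter actually resolves (and from where), something like this can be run in the PyCharm terminal; it only reports, it doesn't fix anything, but it makes a venv/system mismatch easy to spot:

```python
import importlib.util
import sys

def resolve(pkg):
    """Return the file a package would be loaded from, or None if absent."""
    spec = importlib.util.find_spec(pkg)
    return spec.origin if spec else None

# Which Python is running, and where each package comes from for *this* interpreter.
print("interpreter:", sys.executable)
for pkg in ("torch", "torchvision"):
    print(pkg, "->", resolve(pkg) or "not installed here")
```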
pip3 install --force-reinstall torchvision --index-url https://pypi.jetson-ai-lab.io/jp6/cu126
That means the torchvision you installed was built without CUDA support.
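A quick smoke test for whether torchvision's CUDA ops are present (a sketch; on a correctly installed Jetson setup it should print the kept box indices rather than the NotImplementedError above). It is guarded so it reports instead of crashing when torch/torchvision or CUDA are missing:

```python
def check_torchvision_cuda():
    """Try a tiny NMS call on the GPU and report the outcome as a string."""
    try:
        import torch
        from torchvision.ops import nms
        if not torch.cuda.is_available():
            return "CUDA not available"
        # Two overlapping boxes; NMS should keep the higher-scoring one first.
        boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]], device="cuda")
        scores = torch.tensor([0.9, 0.8], device="cuda")
        return f"nms on CUDA OK: {nms(boxes, scores, 0.5)}"
    except Exception as e:  # ImportError, NotImplementedError, ...
        return f"torchvision CUDA check failed: {e}"

print(check_torchvision_cuda())
```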
Be careful when you install other packages that can override your previously installed ones.
I always recommend doing this:
export PIP_INDEX_URL=https://pypi.jetson-ai-lab.io/jp6/cu126
and then install any package, e.g.:
pip3 install -e "."
pip3 install transformers
With that, you are telling pip that its primary index is the Jetson PyPI, with a fallback to pypi.org.
I created a new venv and reinstalled both torch and torchvision as stated above, and I still get the same results. Ever since I upgraded from JetPack 6.1 to 6.2 on the Orin Nano, this has been an ongoing issue.
This was all added recently:
depends: [cuda, cudnn, nccl, gdrcopy, nvpl, cusparselt, cudss, nvshmem, numpy, onnx]
Why isn't this released as a new version/patch? It causes inconsistencies for a given library version. This has been frustrating; I missed some important deadlines just because of it.
Are these cuDSS and cuSPARSELt requirements added to the Jetson PyTorch installation documentation?
Hi Indra,
Thanks for pointing this out and for sharing your experience; I understand how frustrating that must have been, especially with deadlines involved.
Regarding your questions:
- PyPI is mainly intended for jetson-containers.
- If there are missing packages, the end user (or their company) can install them manually or even create custom packages tailored to their specific use case.
- We'll review and confirm whether the cuDSS and cuSPARSELt requirements should be added to the Jetson PyTorch installation documentation to prevent inconsistencies in the future.
Appreciate your feedback; it helps us improve.
Now all features are finally supported on Jetson Orin… so more users can use it.
Hi Johnny, like Sergio I am on 6.2 and I am also struggling to get this version to work. I have had to roll back to Torch 2.7, which works OK. When I install Torch 2.8 I get the "Torch not compiled with CUDA" error. That's after clearing all the pip caches, forcing the install with "pip3 install --force-reinstall torchvision --index-url https://pypi.jetson-ai-lab.io/jp6/cu126", and even downloading the Torch wheel and forcing a local install; I am still getting the same error. You say that the build is really for use with jetson-containers, but that is really restrictive: some of us are trying to innovate and want the software to work natively, without any containers.
I hope we can have a solution that works soon because Torch is such an essential part of so much AI software.
Regards,
Hillary
Every version of torch and torchvision I try just gives different errors. With the legacy wheels I can do enough workarounds to get them to import, but they seem very unstable for me. Having to run under Python 3.10 is not optimal. :\
I have the exact same problem.
johnny@johnny-jetson:~/Projects/jetson-containers$ pip3 install --force-reinstall --no-cache-dir -U torch torchvision --index-url https://pypi.jetson-ai-lab.io/jp6/cu126
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.jetson-ai-lab.io/jp6/cu126
Collecting torch
Downloading https://pypi.jetson-ai-lab.io/jp6/cu126/%2Bf/590/92ab729aee2b8/torch-2.8.0-cp310-cp310-linux_aarch64.whl (227.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 227.2/227.2 MB 17.2 MB/s 0:00:13
Collecting torchvision
Downloading https://pypi.jetson-ai-lab.io/jp6/cu126/%2Bf/1c0/3de08a69e9554/torchvision-0.23.0-cp310-cp310-linux_aarch64.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 5.5 MB/s 0:00:00
Collecting filelock (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/d38/e30481def2077/filelock-3.19.1-py3-none-any.whl (15 kB)
Collecting typing-extensions>=4.10.0 (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/f0f/a19c6845758ab/typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Collecting sympy>=1.13.3 (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/e09/1cc3e99d2141a/sympy-1.14.0-py3-none-any.whl (6.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 97.8 MB/s 0:00:00
Collecting networkx (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/df5/d4365b724cf81/networkx-3.4.2-py3-none-any.whl (1.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 380.1 MB/s 0:00:00
Collecting jinja2 (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/85e/ce4451f492d0c/jinja2-3.1.6-py3-none-any.whl (134 kB)
Collecting fsspec (from torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/530/dc2a2af60a414/fsspec-2025.9.0-py3-none-any.whl (199 kB)
Collecting numpy (from torchvision)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/efd/28d4e9cd7d7a8/numpy-2.2.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.3/14.3 MB 55.6 MB/s 0:00:00
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/f1f/182ebd2303acf/pillow-11.3.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl (6.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0/6.0 MB 89.5 MB/s 0:00:00
Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/a0b/2b9fe80bbcd81/mpmath-1.3.0-py3-none-any.whl (536 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 266.1 MB/s 0:00:00
Collecting MarkupSafe>=2.0 (from jinja2->torch)
Downloading https://pypi.jetson-ai-lab.io/root/pypi/%2Bf/38a/9ef736c01fccd/MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (21 kB)
Installing collected packages: mpmath, typing-extensions, sympy, pillow, numpy, networkx, MarkupSafe, fsspec, filelock, jinja2, torch, torchvision
Attempting uninstall: mpmath
Found existing installation: mpmath 1.3.0
Uninstalling mpmath-1.3.0:
Successfully uninstalled mpmath-1.3.0
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.15.0
Uninstalling typing_extensions-4.15.0:
Successfully uninstalled typing_extensions-4.15.0
Attempting uninstall: sympy
Found existing installation: sympy 1.14.0
Uninstalling sympy-1.14.0:
Successfully uninstalled sympy-1.14.0
Attempting uninstall: pillow
Found existing installation: pillow 11.3.0
Uninstalling pillow-11.3.0:
Successfully uninstalled pillow-11.3.0
Attempting uninstall: numpy
Found existing installation: numpy 2.2.6
Uninstalling numpy-2.2.6:
Successfully uninstalled numpy-2.2.6
Attempting uninstall: networkx
Found existing installation: networkx 3.4.2
Uninstalling networkx-3.4.2:
Successfully uninstalled networkx-3.4.2
Attempting uninstall: MarkupSafe
Found existing installation: MarkupSafe 3.0.2
Uninstalling MarkupSafe-3.0.2:
Successfully uninstalled MarkupSafe-3.0.2
Attempting uninstall: fsspec
Found existing installation: fsspec 2025.9.0
Uninstalling fsspec-2025.9.0:
Successfully uninstalled fsspec-2025.9.0
Attempting uninstall: filelock
Found existing installation: filelock 3.19.1
Uninstalling filelock-3.19.1:
Successfully uninstalled filelock-3.19.1
Attempting uninstall: jinja2
Found existing installation: Jinja2 3.1.6
Uninstalling Jinja2-3.1.6:
Successfully uninstalled Jinja2-3.1.6
Attempting uninstall: torch
Found existing installation: torch 2.8.0
Uninstalling torch-2.8.0:
Successfully uninstalled torch-2.8.0
Attempting uninstall: torchvision
Found existing installation: torchvision 0.23.0
Uninstalling torchvision-0.23.0:
Successfully uninstalled torchvision-0.23.0
Successfully installed MarkupSafe-3.0.2 filelock-3.19.1 fsspec-2025.9.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.4.2 numpy-2.2.6 pillow-11.3.0 sympy-1.14.0 torch-2.8.0 torchvision-0.23.0 typing-extensions-4.15.0
johnny@johnny-jetson:~/Projects/jetson-containers$ python3 packages/pytorch/test.py
testing PyTorch...
PyTorch version: 2.8.0
CUDA available: True
cuDNN version: 90300
PyTorch built with:
- GCC 11.4
- C++ Version: 201703
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: DEFAULT
- CUDA Runtime 12.6
- NVCC architecture flags: -gencode;arch=compute_87,code=sm_87
- CuDNN 90.3
- Build settings: BLAS_INFO=open, BUILD_TYPE=Release, COMMIT_SHA=ba56102387ef21a3b04b357e5b183d48f0afefc7, CUDA_VERSION=12.6, CUDNN_VERSION=9.3.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS=-ffunction-sections -fdata-sections -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_PYTORCH_QNNPACK -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=open, TORCH_VERSION=2.8.0, USE_CUDA=ON, USE_CUDNN=1, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=1, USE_MKLDNN=OFF, USE_MPI=0, USE_NCCL=1, USE_NNPACK=1, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF,
PyTorch 2.8.0
* CUDA device Orin
* CUDA version 12.6
* CUDA cuDNN 90300
* CUDA BLAS _BlasBackend.Cublas
* CUDA linalg _BlasBackend.Cublas
* CUDA flash_attn True
* CUDA flash_sdp True
* CUDA cudnn_sdp True
* CUDA math_sdp True
* CUDA mem_efficient_sdp_enabled True
* CUDA fp16_bf16_reduction_math_sdp False
torch.distributed: True
* NCCL backend is NOT present.
* GLOO backend is NOT present.
* MPI backend is NOT present.
PACKAGING_VERSION=2.8.0
TORCH_CUDA_ARCH_LIST=None
/home/johnny/Projects/jetson-containers/packages/pytorch/test.py:62: UserWarning: The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at /opt/pytorch/torch/csrc/tensor/python_tensor.cpp:78.)
a = torch.cuda.FloatTensor(2).zero_()
Tensor a = tensor([0., 0.], device='cuda:0')
Tensor b = tensor([-0.3596, 0.4977], device='cuda:0')
Tensor c = tensor([-0.3596, 0.4977], device='cuda:0')
testing LAPACK (OpenBLAS)...
done testing LAPACK (OpenBLAS)
testing torch.nn (cuDNN)...
done testing torch.nn (cuDNN)
testing CPU tensor vector operations...
/home/johnny/Projects/jetson-containers/packages/pytorch/test.py:102: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
cpu_y = F.softmax(cpu_x)
Tensor cpu_x = tensor([12.3450])
Tensor softmax = tensor([1.])
Tensor exp (float32) = tensor([[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183]])
Tensor exp (float64) = tensor([[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183]], dtype=torch.float64)
Tensor exp (diff) = 7.429356050359104e-07
PyTorch OK
johnny@johnny-jetson:~/Projects/jetson-containers$ python3 packages/pytorch/torchvision/test.py
testing torchvision...
torchvision version: 0.23.0
testing torchvision extensions...
torchvision classification models: alexnet | convnext_base | convnext_large | convnext_small | convnext_tiny | densenet121 | densenet161 | densenet169 | densenet201 | efficientnet_b0 | efficientnet_b1 | efficientnet_b2 | efficientnet_b3 | efficientnet_b4 | efficientnet_b5 | efficientnet_b6 | efficientnet_b7 | efficientnet_v2_l | efficientnet_v2_m | efficientnet_v2_s | get_model | get_model_builder | get_model_weights | get_weight | googlenet | inception_v3 | list_models | maxvit_t | mnasnet0_5 | mnasnet0_75 | mnasnet1_0 | mnasnet1_3 | mobilenet_v2 | mobilenet_v3_large | mobilenet_v3_small | regnet_x_16gf | regnet_x_1_6gf | regnet_x_32gf | regnet_x_3_2gf | regnet_x_400mf | regnet_x_800mf | regnet_x_8gf | regnet_y_128gf | regnet_y_16gf | regnet_y_1_6gf | regnet_y_32gf | regnet_y_3_2gf | regnet_y_400mf | regnet_y_800mf | regnet_y_8gf | resnet101 | resnet152 | resnet18 | resnet34 | resnet50 | resnext101_32x8d | resnext101_64x4d | resnext50_32x4d | shufflenet_v2_x0_5 | shufflenet_v2_x1_0 | shufflenet_v2_x1_5 | shufflenet_v2_x2_0 | squeezenet1_0 | squeezenet1_1 | swin_b | swin_s | swin_t | swin_v2_b | swin_v2_s | swin_v2_t | vgg11 | vgg11_bn | vgg13 | vgg13_bn | vgg16 | vgg16_bn | vgg19 | vgg19_bn | vit_b_16 | vit_b_32 | vit_h_14 | vit_l_16 | vit_l_32 | wide_resnet101_2 | wide_resnet50_2
Namespace(data_url='https://nvidia.box.com/shared/static/y1ygiahv8h75yiyh0pt50jqdqt7pohgx.gz', data_tar='ILSVRC2012_img_val_subset_5k.tar.gz', models=['resnet18'], resolution=224, workers=2, batch_size=8, print_freq=25, test_threshold=-10.0, use_cuda=True)
using CUDA
dataset classes: 1000
dataset images: 5000
batch size: 8
---------------------------------------------
-- resnet18
---------------------------------------------
loading model 'resnet18'
/home/johnny/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/johnny/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
loaded model 'resnet18'
resnet18 [ 0/625] Time 0.795 ( 0.795) Acc@1 75.00 ( 75.00) Acc@5 100.00 (100.00)
resnet18 [ 25/625] Time 0.031 ( 0.064) Acc@1 62.50 ( 79.81) Acc@5 87.50 ( 94.71)
resnet18 [ 50/625] Time 0.072 ( 0.052) Acc@1 100.00 ( 73.53) Acc@5 100.00 ( 91.91)
resnet18 [ 75/625] Time 0.042 ( 0.049) Acc@1 62.50 ( 76.48) Acc@5 87.50 ( 92.27)
resnet18 [100/625] Time 0.043 ( 0.046) Acc@1 87.50 ( 78.22) Acc@5 100.00 ( 93.07)
resnet18 [125/625] Time 0.009 ( 0.044) Acc@1 87.50 ( 77.38) Acc@5 100.00 ( 93.35)
resnet18 [150/625] Time 0.014 ( 0.043) Acc@1 12.50 ( 76.49) Acc@5 100.00 ( 93.38)
resnet18 [175/625] Time 0.032 ( 0.042) Acc@1 62.50 ( 76.42) Acc@5 100.00 ( 93.75)
resnet18 [200/625] Time 0.065 ( 0.042) Acc@1 100.00 ( 76.55) Acc@5 100.00 ( 93.72)
resnet18 [225/625] Time 0.015 ( 0.042) Acc@1 87.50 ( 76.77) Acc@5 100.00 ( 93.58)
resnet18 [250/625] Time 0.056 ( 0.044) Acc@1 37.50 ( 76.64) Acc@5 62.50 ( 93.63)
resnet18 [275/625] Time 0.010 ( 0.043) Acc@1 87.50 ( 75.72) Acc@5 100.00 ( 93.25)
resnet18 [300/625] Time 0.038 ( 0.043) Acc@1 50.00 ( 74.38) Acc@5 87.50 ( 92.07)
resnet18 [325/625] Time 0.095 ( 0.042) Acc@1 75.00 ( 73.27) Acc@5 100.00 ( 91.53)
resnet18 [350/625] Time 0.065 ( 0.042) Acc@1 62.50 ( 72.69) Acc@5 87.50 ( 91.10)
resnet18 [375/625] Time 0.033 ( 0.042) Acc@1 25.00 ( 72.64) Acc@5 62.50 ( 90.89)
resnet18 [400/625] Time 0.088 ( 0.042) Acc@1 75.00 ( 71.95) Acc@5 87.50 ( 90.27)
resnet18 [425/625] Time 0.014 ( 0.042) Acc@1 62.50 ( 71.33) Acc@5 87.50 ( 89.91)
resnet18 [450/625] Time 0.021 ( 0.041) Acc@1 75.00 ( 71.26) Acc@5 87.50 ( 89.99)
resnet18 [475/625] Time 0.025 ( 0.041) Acc@1 37.50 ( 70.75) Acc@5 75.00 ( 89.63)
resnet18 [500/625] Time 0.022 ( 0.041) Acc@1 75.00 ( 70.33) Acc@5 87.50 ( 89.25)
resnet18 [525/625] Time 0.045 ( 0.041) Acc@1 62.50 ( 69.94) Acc@5 87.50 ( 89.02)
resnet18 [550/625] Time 0.037 ( 0.040) Acc@1 100.00 ( 69.67) Acc@5 100.00 ( 88.77)
resnet18 [575/625] Time 0.045 ( 0.040) Acc@1 75.00 ( 69.31) Acc@5 87.50 ( 88.54)
resnet18 [600/625] Time 0.025 ( 0.040) Acc@1 37.50 ( 69.68) Acc@5 87.50 ( 88.71)
resnet18
* Acc@1 69.740 Expected 69.760 Delta -0.020
* Acc@5 88.760 Expected 89.080 Delta -0.320
* Images/sec 198.624
* PASS
---------------------------------------------
-- Summary
---------------------------------------------
resnet18
* Acc@1 69.740 Expected 69.760 Delta -0.020
* Acc@5 88.760 Expected 89.080 Delta -0.320
* Images/sec 198.624
* PASS
Model tests passing: 1 / 1
torchvision OK
johnny@johnny-jetson:~/Projects/jetson-containers$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:14:07_PDT_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0
johnny@johnny-jetson:~/Projects/jetson-containers$ nvidia-smi
Sun Sep 7 00:17:54 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0 Driver Version: 540.4.0 CUDA Version: 12.6 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Orin (nvgpu) N/A | N/A N/A | N/A |
| N/A N/A N/A N/A / N/A | Not Supported | N/A N/A |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
johnny@johnny-jetson:~/Projects/jetson-containers$
Hello, please verify that you've installed cuDSS, as @johnny_nv showed above. I can confirm that it works.
wget https://developer.download.nvidia.com/compute/cudss/0.6.0/local_installers/cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo dpkg -i cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
sudo cp /var/cudss-local-tegra-repo-ubuntu2204-0.6.0/cudss-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudss
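After installing, the library's visibility to the dynamic loader can be checked without importing torch (a sketch; on a machine without cuDSS it simply reports NOT FOUND instead of raising):

```python
import ctypes.util

# find_library consults the same library search mechanism the loader uses,
# so this roughly mirrors what torch needs when it dlopens libcudss.so.0.
lib = ctypes.util.find_library("cudss")
print("libcudss:", lib if lib else "NOT FOUND -- install the cudss package as above")
```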
Output
(test_torch) ubuntu@ubuntu:~$ python3 -c "import torch; print('torch', torch.__version__, 'built for CUDA', torch.version.cuda)"
torch 2.8.0 built for CUDA 12.6
(test_torch) ubuntu@ubuntu:~$ python3
Python 3.10.18 (main, Jun 5 2025, 13:08:10) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
(test_torch) ubuntu@ubuntu:~$ python3 test.py
torch.Size([1, 1000])
(test_torch) ubuntu@ubuntu:~$
(test_torch) ubuntu@ubuntu:~$ pip3 list | grep torch
torch 2.8.0
torchvision 0.23.0