Hi @AastaLLL
Based on your first message, it seems that l4t-pytorch requires additional dependencies to be installed, so I'm currently building it with the following command:
PYTORCH_VERSION=2.7 jetson-containers build l4t-pytorch
This builds successfully. However, changing PYTORCH_VERSION to another value, or leaving it unset, results in a build failure.
According to the second message, PyTorch 2.5 is expected to work normally. But when I tried building the image with that version, I noticed the following message during installation:
Collecting torch==2.5
Downloading https://pypi.jetson-ai-lab.dev/jp6/cu126/%2Bf/5cf/9ed17e35cb752/torch-2.5.0-cp310-cp310-linux_aarch64.whl (230.6 MB)
This clearly shows that PyTorch 2.5 is being downloaded from the server. However, in the subsequent steps, it appears to fall back to building from source. Here is part of the output:
PyTorch version: 2.5.0
CUDA available: True
Traceback (most recent call last):
...
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (9, 4, 0) but found runtime version (9, 3, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN. Looks like your LD_LIBRARY_PATH contains incompatible version of cudnn. Please either remove it from the path or install cudnn (9, 4, 0)
+ echo 'Building PyTorch 2.5.0'
+ git clone --branch v2.5.0 --depth=1 --recursive https://github.com/pytorch/pytorch /opt/pytorch
This part might be where the issue lies. You might be able to reproduce the same behavior on your end by running the same command:
PYTORCH_VERSION=2.5 jetson-containers build l4t-pytorch
I took a quick look at the source code and here’s my assumption:
In version.py, I noticed the following logic:
elif CUDA_VERSION == Version('12.6'): # JetPack 6.2 (CUDA 12.6)
    PYTORCH_VERSION = Version('2.6')
So, when PYTORCH_VERSION is not set, the build defaults to PyTorch 2.6 on CUDA 12.6. That would explain why omitting the version (instead of specifying 2.7) leads to a failure: the build is then attempted with 2.6 rather than 2.7.
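To make that reading concrete, here is a minimal sketch of how I understand the version selection. Only the CUDA 12.6 to PyTorch 2.6 mapping comes from the version.py excerpt; the function name resolve_pytorch_version, the lookup table, and the assumption that the PYTORCH_VERSION environment variable takes priority are mine and may not match the real code.

```python
import os

# Hypothetical table: only the "12.6" -> "2.6" entry is taken from the
# version.py excerpt; the dict itself is an illustration.
DEFAULT_PYTORCH_BY_CUDA = {"12.6": "2.6"}


def resolve_pytorch_version(cuda_version, env=None):
    """Pick the PyTorch version: explicit override first, CUDA default otherwise.

    Assumption: a non-empty PYTORCH_VERSION in the environment wins over
    the per-CUDA default.
    """
    env = os.environ if env is None else env
    override = env.get("PYTORCH_VERSION")
    if override:
        return override
    # Returns None when no default is known for this CUDA version.
    return DEFAULT_PYTORCH_BY_CUDA.get(cuda_version)
```

Under these assumptions, running the build with no PYTORCH_VERSION on JetPack 6.2 would select 2.6, while PYTORCH_VERSION=2.7 would select 2.7.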
Also, according to the Dockerfile, the build instruction:
RUN /tmp/pytorch/install.sh || /tmp/pytorch/build.sh
will attempt to install PyTorch first, and if that fails, it will proceed to build from source.
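The || here is ordinary shell short-circuiting: the right-hand command runs only when the left-hand one exits with a non-zero status. A rough Python sketch of the same control flow, where run_with_fallback is a hypothetical helper for illustration and not part of jetson-containers:

```python
import subprocess
import sys


def run_with_fallback(primary, fallback):
    """Mimic `primary || fallback` for two argv lists.

    Returns "primary" if the first command exits with status 0; otherwise
    runs the second command and returns "fallback".
    """
    if subprocess.run(primary).returncode == 0:
        return "primary"
    # Like the shell, only reach the fallback after a non-zero exit status.
    subprocess.run(fallback, check=True)
    return "fallback"


# Stand-ins for install.sh (fails) and build.sh (succeeds).
failing_install = [sys.executable, "-c", "raise SystemExit(1)"]
working_build = [sys.executable, "-c", "pass"]
```

With these stand-ins, run_with_fallback(failing_install, working_build) takes the fallback path, which matches the log where 'Building PyTorch 2.5.0' appears right after the traceback.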
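It might be that the installation step fails because of the cuDNN mismatch shown in the traceback above: the downloaded wheel was compiled against cuDNN 9.4.0, but the runtime found in the container is 9.3.0, so install.sh exits with an error and build.sh takes over.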
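For anyone scanning build logs for this, here is a small sketch that pulls the two cuDNN versions out of the error text. The helper name parse_cudnn_mismatch is hypothetical, and the regular expression is written against the exact wording of the RuntimeError quoted above, so it may need adjusting for other PyTorch releases.

```python
import re

# Matches the wording of the RuntimeError in the log above.
_CUDNN_MISMATCH = re.compile(
    r"compiled against \((\d+), (\d+), (\d+)\) "
    r"but found runtime version \((\d+), (\d+), (\d+)\)"
)


def parse_cudnn_mismatch(message):
    """Return (compiled_version, runtime_version) tuples, or None if absent."""
    match = _CUDNN_MISMATCH.search(message)
    if match is None:
        return None
    numbers = tuple(int(group) for group in match.groups())
    return numbers[:3], numbers[3:]
```

If the compiled and runtime tuples differ, the pip-installed wheel cannot be used as-is in that container, which would be consistent with the fallback to a source build.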