The only torch package that seems to be compatible with JetPack 6.1 on the Jetson Orin Nano hardware is:
torch-2.5.0a0+872d972e41.nv24.08.17622132-cp310-cp310-linux_aarch64.whl
Unfortunately, it was not built with USE_DISTRIBUTED=1, so I am stuck trying to run things that depend on torch.distributed.is_available().
Is there an alternative pre-built package available that would get me around having to recompile from scratch? My Jetson system is close to out of storage, and I would really like to avoid having to get the compile environment set up properly. Thanks!
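For reference, the symptom on the stock wheel looks like this (a representative session; the downstream failures vary by library):
$ python3
>>> import torch
>>> torch.distributed.is_available()
False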
I understand the challenge you’re facing with JetPack 6.1 on the Jetson Orin Nano. It’s not uncommon to run into compatibility issues with distributed features like torch.distributed on embedded systems, especially when pre-built packages don’t include the necessary flags or features.
To answer your question:
- Alternative Pre-built Packages: Currently, NVIDIA’s Jetson-specific Torch builds (like the one you’re using) are often tailored to the platform and might not include the USE_DISTRIBUTED=1 configuration, given the constraints of embedded systems. However, there are a few avenues to explore:
- NVIDIA PyTorch Containers: While this won’t directly address the specific build issue, you might want to check out NVIDIA’s pre-built PyTorch containers for Jetson. These are fully optimized for Jetson hardware and might have better support for distributed features out-of-the-box, though they require Docker (a minimal invocation is sketched at the end of this list).
- Third-Party Wheels: It’s possible that some other repositories (such as the ones from community members working with Jetson) have built versions of PyTorch with torch.distributed enabled. You could try searching places like JetsonHacks or other GitHub repositories that specialize in pre-built wheels for Jetson systems.
- Minimizing Storage Overhead: Given that your system is running low on storage, I understand the reluctance to recompile from scratch. One option is to look for smaller pre-built packages with only the necessary components of PyTorch, possibly stripping down non-essential features. You might also explore symlink-based approaches to avoid using up too much local disk space if you have external storage options available.
- Recompiling Considerations: If you do end up having to recompile, I’d recommend ensuring that the build environment is as lean as possible (e.g., by using an external disk for build dependencies or utilizing an SSD if available). Additionally, NVIDIA’s Jetson Optimization Guide has some helpful tips for optimizing PyTorch builds on Jetson devices to reduce build time and storage usage.
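For the container route, a minimal invocation would look roughly like the following; the image tag is a placeholder, so check NGC or the jetson-containers project for one matching JetPack 6:
$ docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-pytorch:<tag>
The --runtime nvidia flag is what exposes the GPU to the container on Jetson.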
In summary, while I don’t have a direct alternative package off the top of my head that meets your exact needs (with USE_DISTRIBUTED=1 enabled), you may want to keep exploring third-party wheels or containerized environments like Docker. If that’s not viable, compiling might still be the best option to fully leverage distributed features, but I completely understand the desire to avoid it.
Let me know if you need further insights.
Thank you, Julian, for your informative reply! I have managed to use the jetson-containers Docker orchestration system to build a torch image, and have extracted the resulting filesystem. I am hopeful that the resulting torch package has distributed support enabled, although I am unsure how to tell whether this is actually the case. My main questions now are:
- Did this build enable torch.distributed as desired? (I have seen comments that USE_DISTRIBUTED=1 is the default for Linux builds; a quick check is sketched after this list.)
- How can I build an installable whl file from the build results?
- The resulting build appears to have a complex directory structure (shown below), which I don’t fully understand. Any pointers welcome!
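On the first question, I believe torch.distributed.is_available() is the definitive check, since it returns False whenever torch was built without distributed support:
$ python3
>>> import torch
>>> torch.distributed.is_available()  # hoping for True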
I have never built a Python whl file from scratch before, but apparently there is a lot of metadata that needs to be specified, either in a setup.py file or in whatever Poetry requires (which I have also never used).
Is there any hope of finding documentation on how to extract and package the build result correctly?
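For what it’s worth, my current (untested) understanding is that a minimal setup.py wrapping an already-extracted torch tree would look something like the sketch below; the version string, globs, and dependency list are illustrative guesses, not the real metadata:
# setup.py - minimal sketch, assuming the built torch package
# has been copied into a local ./torch directory next to this file.
from setuptools import setup, find_packages

setup(
    name="torch",
    version="2.5.0a0+jetson",  # hypothetical local version tag
    packages=find_packages(include=["torch", "torch.*"]),
    package_data={"torch": ["lib/*.so*", "bin/*", "include/*"]},
    install_requires=["numpy", "typing-extensions"],  # the real wheel declares more
)
$ python3 setup.py bdist_wheel   # requires the 'wheel' package; writes dist/torch-*.whl
That said, the official wheels are produced by PyTorch’s own setup.py during the build (python3 setup.py bdist_wheel from the source tree), which generates all of this metadata automatically, so reusing that path may be simpler than hand-writing it.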
Hi,
Please try the package from the link below instead:
https://pypi.jetson-ai-lab.dev/jp6/cu126/torch/2.5.0
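It can be installed with pip directly from that index, for example (the exact package spec may vary; see the index page):
$ pip3 install torch --index-url https://pypi.jetson-ai-lab.dev/jp6/cu126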
$ python3
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.5.0'
>>> torch.distributed.is_available()
True
Thanks.
Nice - that may be cleaner than what I currently have. Good find!