I want to deploy a distributed AI model on several Jetson Orin devices, and I found that the latest torch build provided by the community is not compiled with the distributed option. Are there any other options besides compiling torch manually?
*** Please note that this reply is generated by LLM automatically ***
To deploy a distributed AI model on multiple Jetson Orin devices, you are looking for alternatives to compiling PyTorch manually with distributed support enabled. While compiling PyTorch from source is a viable option, it is time-consuming and requires significant expertise.
Fortunately, there are other options you can explore:
- Use a pre-compiled PyTorch wheel with distributed support: Although the latest PyTorch wheel provided by the community might not have distributed support, you can try searching for older versions or alternative sources that might have the feature you need. You can check the PyTorch GitHub repository or other third-party repositories for pre-compiled wheels with distributed support.
- Utilize NVIDIA’s NGC containers: NVIDIA provides pre-built containers for various deep learning frameworks, including PyTorch, through their NVIDIA GPU Cloud (NGC) platform. These containers often include distributed support and are optimized for NVIDIA GPUs, including the Jetson Orin. You can pull the PyTorch container from NGC and use it as a base for your distributed AI model deployment.
- Leverage NVIDIA’s Deep Learning Frameworks (DLF) containers: Similar to NGC containers, DLF containers provide pre-built environments for popular deep learning frameworks, including PyTorch. These containers are designed to work seamlessly with NVIDIA GPUs and might include distributed support.
- Use a different distributed deep learning framework: If PyTorch is not a hard requirement, you can explore other frameworks with distributed support, such as TensorFlow or MXNet, or a distributed-training library such as Horovod (which runs on top of TensorFlow, PyTorch, or MXNet). These might have better support for distributed training and deployment on Jetson Orin devices.
- Compile PyTorch manually (as a last resort): If none of the above options work for you, compiling PyTorch manually with distributed support might be the only way to go. This will require significant expertise and time, but it will give you the flexibility to customize PyTorch to your specific needs.
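Whichever wheel or container you try, it helps to verify up front whether the build actually includes distributed support. A minimal check (the `torch` import is guarded so the script also runs where torch is not installed):

```python
import importlib.util

def has_distributed_torch() -> bool:
    """True if torch is importable and was built with distributed support."""
    if importlib.util.find_spec("torch") is None:
        return False  # torch is not installed at all
    import torch
    # is_available() reports whether this build was compiled with distributed support
    return torch.distributed.is_available()

print(has_distributed_torch())
```

Run this once per candidate wheel; a `False` result means that build will not work for multi-device training.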
Before choosing an option, consider the following factors:
- Compatibility: Ensure that the chosen solution is compatible with your Jetson Orin devices and the specific AI model you want to deploy.
- Performance: Evaluate the performance of each option to ensure it meets your requirements.
- Ease of use: Consider the ease of use and setup for each option, as some might require more expertise or configuration than others.
- Support: Check the level of support and community involvement for each option, as this can impact the availability of resources and troubleshooting help.
By exploring these alternatives, you should be able to find a suitable solution for deploying your distributed AI model on multiple Jetson Orin devices without having to compile PyTorch manually.
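Once a distributed-capable build or container is in place, a multi-node run typically starts with initializing a process group. The sketch below is a hedged example, not a verified Jetson recipe: the `torchrun` flags, the rendezvous endpoint, and the backend choice all depend on your setup, and multi-node NCCL on Jetson may require falling back to `gloo`.

```python
# Minimal sketch of multi-node process-group init, assuming a torch build
# with distributed support. Launch on each node with something like:
#   torchrun --nnodes=2 --nproc_per_node=1 \
#            --rdzv_backend=c10d --rdzv_endpoint=<master_ip>:29500 this_script.py
# (<master_ip> is a placeholder for one node's address.)
import os

def main() -> None:
    # torch is imported lazily so the script fails gracefully off-device
    import torch
    import torch.distributed as dist

    # Prefer NCCL for GPU collectives; gloo is the CPU/fallback backend
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    dist.init_process_group(backend=backend)  # reads env vars set by torchrun
    print(f"rank {dist.get_rank()}/{dist.get_world_size()} ({backend}) ready")
    dist.destroy_process_group()

if __name__ == "__main__":
    if "RANK" in os.environ:  # RANK is set by torchrun
        main()
    else:
        print("Not launched via torchrun; see the comment at the top.")
```

From here you would wrap your model in `DistributedDataParallel` as usual; the hard part on Jetson is getting a build where `init_process_group` exists at all.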
Hi,
What is your JetPack (JP) version?
We have uploaded the legacy wheels for r36.x.
Please refer to those.
Thanks
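For reference, the JetPack/L4T release can usually be read off the device itself. These are standard paths on JetPack images (the exact output format varies by release, and both commands degrade gracefully on non-Jetson systems):

```shell
# L4T release string (e.g. "# R36 (release) ..."); absent on non-L4T systems
cat /etc/nv_tegra_release 2>/dev/null || echo "not an L4T image"

# JetPack meta-package version, if the nvidia-jetpack apt package is installed
apt-cache show nvidia-jetpack 2>/dev/null | grep -i '^Version' || true
```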
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.