Jetson Orin Nano Compatibility with NCCL for PyTorch

Hi,

I’ve been working with a Jetson Orin Nano and recently installed PyTorch v1.11.0. While testing its distributed capabilities, I noticed that torch.distributed.is_available() returns True but torch.distributed.is_nccl_available() returns False. This led me to wonder whether the Jetson Orin Nano supports NCCL for PyTorch.
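For reference, this is roughly the check I ran; it only uses the standard torch.distributed helpers (the extra Gloo/MPI probes are just to see what my wheel was built with):

```python
# Minimal sketch of the backend check, using only standard torch.distributed helpers.
import torch
import torch.distributed as dist

print("torch version:", torch.__version__)
print("distributed available:", dist.is_available())   # True on my install
print("NCCL available:", dist.is_nccl_available())      # False on my install
print("Gloo available:", dist.is_gloo_available())
print("MPI available:", dist.is_mpi_available())
```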

My attempts to rebuild PyTorch from source with python setup.py bdist_wheel have either hung the system or ended with the compilation terminating with errors partway through the build.

I’m having trouble resolving this and would appreciate guidance on how to enable NCCL support for PyTorch on my Jetson Orin Nano. Any advice or insights from the community would be welcome.

Thanks

Moving this topic to the Jetson Orin Nano category for visibility.

Hi @m.tharani.193, Jetson is a single-GPU architecture and doesn’t support NCCL.

You can recompile PyTorch with USE_DISTRIBUTED enabled (try mounting additional swap if it hangs), but it will use MPI instead of NCCL.
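A rough sketch of how that rebuild could be driven, assuming the standard PyTorch build flags (USE_DISTRIBUTED, USE_NCCL, USE_MPI) and the MAX_JOBS limit honored by setup.py; adjust MAX_JOBS and the swap size to the Orin Nano's memory:

```python
# Hedged sketch: rebuild PyTorch with distributed support but without NCCL.
# Assumes the standard PyTorch build environment variables and that this is
# run from the root of a PyTorch source checkout.
import os
import subprocess

env = dict(
    os.environ,
    USE_DISTRIBUTED="1",  # build torch.distributed
    USE_NCCL="0",         # NCCL is not supported on single-GPU Jetson
    USE_MPI="1",          # build the MPI backend instead (needs an MPI install)
    MAX_JOBS="4",         # limit parallel compile jobs to avoid exhausting RAM/swap
)

subprocess.run(["python3", "setup.py", "bdist_wheel"], env=env, check=True)
```

After the rebuild, torch.distributed.init_process_group() would be called with backend="mpi" (or "gloo") rather than "nccl".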