PyTorch does not support NCCL


2 × Jetson AGX Orin 64GB


JetPack   5.1.1
Python    3.8.10
NCCL      2.11.4+cuda11.4
PyTorch   v1.11.0

The PyTorch build I used is the one provided by NVIDIA:
PyTorch for Jetson
I am trying to build a distributed development environment based on the AGX Orin, with the nodes communicating over NCCL.
I previously tried PyTorch v2.1, but it doesn't seem to provide the distributed module.

# PyTorch v2.1.0
>>> import torch
>>> torch.distributed.is_available()

Then I switched to v1.11.0, but I ran into the following problem:

# PyTorch v1.11.0
>>> import torch
>>> torch.distributed.is_available()
>>> torch.distributed.is_nccl_available()
>>> torch.cuda.nccl.is_available(torch.randn(1).cuda())
/usr/local/lib/python3.8/dist-packages/torch/cuda/ UserWarning: PyTorch is not compiled with NCCL support
  warnings.warn('PyTorch is not compiled with NCCL support')

I would like to know: does the Orin support NCCL? And how can I solve this problem so that I can use NCCL? Thanks!

Hi @whoops, NCCL is not supported on the Jetson platform. You can build PyTorch with USE_DISTRIBUTED enabled, and it will use MPI instead of NCCL.
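
To check which backend a given build can actually use, something like the following might help. It is only a rough sketch, not an official recipe: the helper names `pick_backend` and `detect_backend` are made up for illustration, and falling back to Gloo when MPI is also missing is an assumption on my part rather than something stated above. The `torch.distributed.is_*_available()` calls are the standard PyTorch checks.

```python
def pick_backend(nccl_ok: bool, mpi_ok: bool, gloo_ok: bool) -> str:
    """Hypothetical helper: choose a torch.distributed backend name.

    Prefers NCCL when the build has it, then MPI (the suggestion above
    for Jetson), then Gloo (an assumed extra fallback).
    """
    if nccl_ok:
        return "nccl"
    if mpi_ok:
        return "mpi"
    if gloo_ok:
        return "gloo"
    raise RuntimeError("this PyTorch build has no usable distributed backend")


def detect_backend() -> str:
    """Query the installed PyTorch build and pick a backend."""
    # Imported lazily so pick_backend() can be used without PyTorch installed.
    import torch.distributed as dist

    # The backend checks only exist when the build was compiled with
    # USE_DISTRIBUTED enabled, so test is_available() first.
    if not dist.is_available():
        raise RuntimeError("PyTorch was built without USE_DISTRIBUTED")
    return pick_backend(
        dist.is_nccl_available(),
        dist.is_mpi_available(),
        dist.is_gloo_available(),
    )
```

On a build without NCCL but with MPI, `detect_backend()` would return "mpi", and the result could then be passed as `backend=` to `torch.distributed.init_process_group`. With the MPI backend the processes would normally be started through an MPI launcher such as `mpirun`, which supplies the rank and world size.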

Thank you for the quick reply, @dusty_nv. I would like to know whether the Jetson platform will support NCCL in the future.

