2 × Jetson AGX Orin 64GB
JetPack 5.1.1, Python 3.8.10, NCCL 2.11.4+cuda11.4, PyTorch v1.11.0
The PyTorch build I am using is the one provided by NVIDIA:
PyTorch for Jetson
I am trying to build a distributed development environment on two AGX Orin devices, using NCCL for communication.
I previously tried PyTorch v2.1.0, but that build does not seem to include the distributed module:
# PyTorch v2.1.0
>>> import torch
>>> torch.distributed.is_available()
False
I then switched to v1.11.0, but ran into the following problem:
# PyTorch v1.11.0
>>> import torch
>>> torch.distributed.is_available()
True
>>> torch.distributed.is_nccl_available()
False
>>> torch.cuda.nccl.is_available(torch.randn(1).cuda())
/usr/local/lib/python3.8/dist-packages/torch/cuda/nccl.py:15: UserWarning: PyTorch is not compiled with NCCL support
  warnings.warn('PyTorch is not compiled with NCCL support')
False
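In case it helps anyone reproduce this, the checks above can be gathered into one script. This is only a minimal sketch: the function name `distributed_report` is my own and not part of PyTorch, and it assumes the standard `torch.distributed.is_available()`, `is_nccl_available()`, `is_gloo_available()` and `is_mpi_available()` helpers. The backend helpers are looked up defensively, because they may be missing from a build that was compiled without distributed support.

```python
import importlib


def distributed_report():
    """Describe which torch.distributed features the installed build exposes.

    `distributed_report` is a hypothetical helper name, not a PyTorch API.
    """
    report = {
        "torch": None,         # PyTorch version string, or None if not installed
        "cuda": None,          # CUDA version the build was compiled against
        "distributed": False,  # torch.distributed.is_available()
        "nccl": False,
        "gloo": False,
        "mpi": False,
    }
    try:
        torch = importlib.import_module("torch")
        dist = importlib.import_module("torch.distributed")
    except ImportError:
        # PyTorch is not installed in this interpreter.
        return report

    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda
    report["distributed"] = bool(dist.is_available())

    if report["distributed"]:
        for name in ("nccl", "gloo", "mpi"):
            # Look the helper up by name so a missing one is reported as False.
            check = getattr(dist, f"is_{name}_available", None)
            report[name] = bool(check()) if callable(check) else False
    return report


if __name__ == "__main__":
    for key, value in distributed_report().items():
        print(f"{key}: {value}")
```

Running it on each Orin and comparing the output should show whether both devices have the same build and which backends that build was compiled with.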
Does the Orin support NCCL? If so, how can I get PyTorch to use it? Thanks!