Hi,
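torch.distributed is using MPI as its backend here, and the MPI backend can only handle CUDA tensors if the underlying MPI library is CUDA-aware. Based on the error, was your OpenMPI built with CUDA support?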
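If you are not sure, here is a rough sketch of one way to check. It assumes the `ompi_info` tool from your OpenMPI install is on your PATH and that it reports the `mpi_built_with_cuda_support` parameter. The helper names (`parse_cuda_support`, `openmpi_has_cuda_support`) are made up for this example.

```python
import shutil
import subprocess
from typing import Optional

# Key that `ompi_info --parsable --all` prints for CUDA-aware builds, e.g.
# mca:mpi:base:param:mpi_built_with_cuda_support:value:true
CUDA_KEY = "mpi_built_with_cuda_support:value:"


def parse_cuda_support(ompi_info_output: str) -> Optional[bool]:
    """Return True/False from ompi_info output, or None if the key is absent."""
    for line in ompi_info_output.splitlines():
        if CUDA_KEY in line:
            return line.strip().endswith(":true")
    return None


def openmpi_has_cuda_support() -> Optional[bool]:
    """Run ompi_info (assumed to be on PATH) and report CUDA support.

    Returns None when ompi_info is missing or does not print the key.
    """
    exe = shutil.which("ompi_info")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--parsable", "--all"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_cuda_support(result.stdout)
```

Calling `openmpi_has_cuda_support()` should give `True` for a CUDA-aware build, `False` for a build without CUDA, and `None` if it could not tell. Note that this checks whichever OpenMPI is first on your PATH, which may not be the one PyTorch was compiled against. You can also confirm that your PyTorch build has the MPI backend at all with `torch.distributed.is_mpi_available()`.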
Thanks.