Issue:
I installed torch-2.5.0a0+872d972e41.nv24.08 with JetPack 6.1 on an AGX Orin
and found that torch.distributed is incomplete. Running the following commands in Python fails with the error: module 'torch.distributed' has no attribute 'init_process_group'
~$ python
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.5.0a0+872d972e41.nv24.08
>>> import torch.distributed as dist
>>> dist.init_process_group(backend="nccl")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'torch.distributed' has no attribute 'init_process_group'
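A quick way to check whether an installed wheel was built with distributed support is sketched below. It assumes that a build compiled without distributed support still exposes torch.distributed.is_available() (returning False) but does not export init_process_group, which would match the AttributeError above. The helper name distributed_status is hypothetical and not part of PyTorch.

```python
def distributed_status(dist_module):
    """Summarize what a torch.distributed-like module exposes.

    distributed_status is a hypothetical helper for illustration only.
    """
    is_available = getattr(dist_module, "is_available", None)
    available = bool(is_available()) if callable(is_available) else False
    # Assumption: is_nccl_available is only exported when distributed
    # support is compiled in, so look it up defensively.
    nccl_check = getattr(dist_module, "is_nccl_available", None)
    return {
        "available": available,
        "has_init_process_group": hasattr(dist_module, "init_process_group"),
        "nccl": bool(nccl_check()) if callable(nccl_check) else False,
    }


try:
    import torch.distributed as dist
except ImportError:  # torch is not installed in this interpreter
    dist = None

if dist is not None:
    print(distributed_status(dist))
```

If "available" comes back False, the wheel itself was probably built without distributed support, and reinstalling the same wheel in another environment would not be expected to change the result.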
Environment:
device: AGX Orin
torch version: torch-2.5.0a0+872d972e41.nv24.08.17622132-cp310-cp310-linux_aarch64.whl, downloaded via the PyTorch for JetPack 6.1 link on Jetson Download Center | NVIDIA Developer
JetPack: 6.1
python: 3.10.12
Question:
I tried to rebuild PyTorch 2.5.0 myself, but the build failed. How can I get a full build of PyTorch 2.5.0, including torch.distributed, on JetPack 6.1?
After installing the new torch 2.5.0 build you mentioned, I ran into a new issue. Please see the following error message:
>>> import torch
>>> import torch.distributed as dist
>>> dist.init_process_group(backend="nccl")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/c10d_logger.py", line 97, in wrapper
    func_return = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 1520, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/rendezvous.py", line 258, in _env_rendezvous_handler
    rank = int(_get_env_or_raise("RANK"))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/rendezvous.py", line 243, in _get_env_or_raise
    raise _env_error(env_var)
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set
I installed the new torch 2.5.0 in three environments (directly on the AGX Orin device, in a virtual environment, and in Docker), and the issue is the same in all three.
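The ValueError above is raised by the default env:// rendezvous, which reads RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT from the environment. Since the call now reaches init_process_group, the distributed module itself appears to be present, and the failure looks like a launch configuration issue. A minimal sketch for a single-process test follows. The helper names set_single_process_env and init_single_process are hypothetical, and the address and port values are assumptions chosen for a local run.

```python
import os


def set_single_process_env(addr="127.0.0.1", port="29500"):
    """Fill in the variables that env:// rendezvous reads, if unset.

    Hypothetical helper; addr and port are assumed local defaults.
    """
    defaults = {
        "RANK": "0",
        "WORLD_SIZE": "1",
        "MASTER_ADDR": addr,
        "MASTER_PORT": str(port),
    }
    for key, value in defaults.items():
        # setdefault keeps any value already provided by a launcher
        os.environ.setdefault(key, value)
    return {key: os.environ[key] for key in defaults}


def init_single_process(backend="nccl"):
    """Initialize a one-process group using the env:// rendezvous."""
    import torch.distributed as dist  # imported lazily; requires torch

    set_single_process_env()
    dist.init_process_group(backend=backend)
    return dist
```

Calling init_single_process() from the interactive session should get past the RANK check. Launching a script with torchrun is the other common route, since torchrun sets these variables for each worker.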