Effective PyTorch and CUDA

I attached the Dockerfile. I am pulling the latest NVIDIA PyTorch container, but I am still getting issues with sm_121.

```dockerfile
FROM nvcr.io/nvidia/pytorch:25.12-py3

# Just to be explicit
RUN pip install --upgrade pip

# Install your Python deps
RUN pip install \
    torchao \
    torchtune \
    datasets \
    bitsandbytes \
    tensorboard \
    matplotlib \
    mlflow \
    python-dotenv \
    loguru \
    nvidia-ml-py

# Default workdir inside the container (can be overridden with -w at runtime)
WORKDIR /workspace
```

Environment info from inside the container:

```
Python 3.12.3
PyTorch installed version: 2.10.0a0+b4e4ee81d3.nv25.12
PyTorch CUDA version: 13.1
PyTorch CUDA is available: True
PyTorch CUDA device count: 1
PyTorch NCCL version: (2, 28, 9)
```
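For reference, this is the kind of quick diagnostic I run to compare the GPU's compute capability against the arch list the wheel was built with (the helper name is mine; it only assumes `torch` may be installed and degrades gracefully otherwise):

```python
def describe_cuda() -> str:
    """Report the GPU's sm_XY arch next to the build's supported arch list,
    so a mismatch like sm_121 vs sm_120 is visible at a glance."""
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if not torch.cuda.is_available():
        return "CUDA not available"
    major, minor = torch.cuda.get_device_capability(0)
    return (f"device arch: sm_{major}{minor}; "
            f"build archs: {torch.cuda.get_arch_list()}")

print(describe_cuda())
```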

```
Traceback (most recent call last):
  File "/workspace/torch_tune/src/full_test_dist.py", line 55, in <module>
    raise RuntimeError(
RuntimeError: Unsupported GPU for this PyTorch build. Detected device arch sm_121, but this PyTorch install only supports: compute_120, sm_100, sm_110, sm_120, sm_80, sm_86, sm_90. Install a PyTorch build that includes your GPU architecture (or use a different GPU).
E0112 14:33:34.279000 211 torch/distributed/elastic/multiprocessing/api.py:978] failed (exitcode: 1) local_rank: 0 (pid: 236) of binary: /usr/bin/python
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 362, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 982, in main
    run(args)
  File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 973, in run
    elastic_launch(
  File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 165, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 313, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
```
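The guard that raises at line 55 of my script can be sketched in isolation; the function name and structure below are illustrative assumptions, not the actual code from full_test_dist.py:

```python
# Hypothetical reconstruction of the arch guard (name and structure are
# assumptions for illustration, not the actual script).
def check_arch_supported(device_arch: str, build_archs: list[str]) -> None:
    """Raise if the detected GPU arch is absent from the build's arch list."""
    if device_arch not in build_archs:
        raise RuntimeError(
            f"Unsupported GPU for this PyTorch build. Detected device arch "
            f"{device_arch}, but this PyTorch install only supports: "
            f"{', '.join(sorted(build_archs))}."
        )

# The arch list reported by this container's wheel (taken from the error above):
build_archs = ["compute_120", "sm_100", "sm_110", "sm_120",
               "sm_80", "sm_86", "sm_90"]
check_arch_supported("sm_120", build_archs)   # passes silently
# check_arch_supported("sm_121", build_archs)  # raises RuntimeError
```

The point is that the check is an exact string match: sm_120 is in the list but sm_121 is not, so the guard fires even though the two architectures are in the same family.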