That's actually a really good idea - treating the same node as two nodes through a distributed job is a fine approach. I just tried it, and it fails with the same error, but it did give another data point:
- Once the first process is using GPUs 0,1,2,3, CUDA errors out when the second process tries to initialize GPUs 4,5,6,7. So something cross-process is blocking initialization of more than 4 GPUs at a time.
For example, run two different processes:

```
CUDA_VISIBLE_DEVICES=0,1,2,3 python -c 'import torch; import time; torch.cuda.is_available(); time.sleep(100)'
CUDA_VISIBLE_DEVICES=4,5,6,7 python -c 'import torch; import time; torch.cuda.is_available(); time.sleep(100)'
```
If you specify the same GPUs in both processes, both succeed. However, if the two processes together span more than 4 distinct GPUs, whichever process is launched second will fail.
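
In case it helps narrow this down, here's a minimal sketch (assuming PyTorch, as in the one-liners above) to run as the second process while the first one is holding GPUs 0,1,2,3. It initializes each visible device one at a time instead of all at once, which should show whether the second process fails on its very first device or only once the combined count across both processes exceeds 4:

```python
import torch

# Run with e.g. CUDA_VISIBLE_DEVICES=4,5,6,7 while another process
# already holds GPUs 0,1,2,3. Devices are renumbered 0..3 within
# this process's view.
for i in range(torch.cuda.device_count()):
    try:
        # A tiny allocation forces CUDA context creation on this device,
        # rather than relying on torch.cuda.is_available()'s lazy init.
        torch.zeros(1, device=f"cuda:{i}")
        print(f"cuda:{i} initialized OK")
    except RuntimeError as e:
        print(f"cuda:{i} failed: {e}")
```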