Torch crashes driver on H100

swagginty · June 27, 2025, 6:50am

CUDA Version: 12.9
torch Version: 2.7.1+cu128
I am using a single H100 PCIe on Paperspace.

>>> torch.cuda.is_available()
True

This shows that CUDA is available.
However, when I run

x = torch.randn(1, 3, 224, 224, device="cuda")

torch tries to initialize CUDA and I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tyrin/Desktop/CCAMSync/.pixi/envs/flash/lib/python3.10/site-packages/torch/cuda/__init__.py", line 372, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 802: system not yet initialized
>>> torch.cuda.is_available()

I’ve looked at other posts and seen that this error is usually related to Fabric Manager; however, I am using a single-GPU system.
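For anyone trying to reproduce this, here is a minimal sketch that records both results in one run: what `torch.cuda.is_available()` reports, and whether forcing initialization with `torch.cuda.init()` succeeds. The function name `probe_cuda_init` and the report keys are made up for illustration; only the `torch.cuda` calls come from the library.

```python
def probe_cuda_init():
    """Report whether CUDA looks available and whether it actually initializes."""
    report = {
        "torch_installed": False,
        "is_available": None,
        "init_ok": None,
        "error": None,
    }
    try:
        import torch
    except ImportError:
        return report  # torch is not installed in this environment

    report["torch_installed"] = True
    report["is_available"] = torch.cuda.is_available()
    try:
        # Forces the lazy CUDA initialization that failed in the traceback above.
        torch.cuda.init()
        report["init_ok"] = True
    except (RuntimeError, AssertionError) as exc:
        # Keep the message so it can be pasted into a bug report.
        report["init_ok"] = False
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in probe_cuda_init().items():
        print(f"{key}: {value}")
```

If `is_available` is `True` while `init_ok` is `False`, the output matches the situation described above, and the `error` field holds the full message.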

rs277 · June 27, 2025, 7:53am

Do you get the same error with CUDA 12.8, which seems to be the version your torch build is based on?
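To compare the two, it may help to print the CUDA version torch was built against next to the driver version. This is a rough sketch that assumes `nvidia-smi` is on the PATH. The helper names `driver_version` and `torch_cuda_build` are hypothetical, and both return `None` when the information is not available.

```python
import shutil
import subprocess


def driver_version():
    """Return the NVIDIA driver version reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None  # nvidia-smi failed, e.g. it could not talk to the driver
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


def torch_cuda_build():
    """Return the CUDA version torch was compiled against, or None."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda  # None on CPU-only builds


if __name__ == "__main__":
    print("driver version:", driver_version())
    print("torch built for CUDA:", torch_cuda_build())
```

For the setup in the first post, the second line would be expected to show 12.8, since the installed wheel is `2.7.1+cu128`.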
