Problem with bare metal install for 22.09 in WSL: CUDA graphs may only be used in Pytorch built with CUDA >= 11


I have problems with Docker in WSL for 22.09, so I tried a bare-metal install instead. When I ran the Helmholtz example, it ran for a few steps and then failed with this error:

[11:12:04] - JitManager: {'_enabled': False, '_arch_mode': <JitArchMode.ONLY_ACTIVATION: 1>, '_use_nvfuser': True, '_autograd_nodes': False}
[11:12:04] - GraphManager: {'_func_arch': False, '_debug': False, '_func_arch_allow_partial_hessian': True}
[11:12:09] - attempting to restore from: outputs/helmholtz
[11:12:09] - Success loading optimizer: outputs/helmholtz/optim_checkpoint.0.pth
[11:12:09] - Success loading model: outputs/helmholtz/wave_network.0.pth
[11:12:10] - [step:          0] record constraint batch time:  9.617e-02s
[11:12:11] - [step:          0] record validators time:  1.156e+00s
[11:12:11] - [step:          0] saved checkpoint to outputs/helmholtz
[11:12:11] - [step:          0] loss:  9.899e+03
[11:12:13] - Attempting cuda graph building, this may take a bit...
Error executing job with overrides: []
Traceback (most recent call last):
  File "", line 92, in run
  File "/home/user/modulus_22.09/lib/python3.8/site-packages/modulus-22.9-py3.8.egg/modulus/solver/", line 159, in solve
  File "/home/user/modulus_22.09/lib/python3.8/site-packages/modulus-22.9-py3.8.egg/modulus/", line 521, in _train_loop
    loss, losses = self._cuda_graph_training_step(step)
  File "/home/user/modulus_22.09/lib/python3.8/site-packages/modulus-22.9-py3.8.egg/modulus/", line 724, in _cuda_graph_training_step
    self.g = torch.cuda.CUDAGraph()
  File "/home/user/modulus_22.09/lib/python3.8/site-packages/torch/cuda/", line 50, in __init__
    super(CUDAGraph, self).__init__()
RuntimeError: CUDA graphs may only be used in Pytorch built with CUDA >= 11.0 and not yet supported on ROCM

How can I solve this error?


Hi @tsltaywb

Try turning CUDA graphs off in the config.yaml file by adding: cuda_graphs: False.

Have a look at helmholtz/conf/config_hardBC.yaml for an example with this setting; it will disable CUDA graph capture. Keep in mind that CUDA graphs are a beta feature in PyTorch, so support may be limited, as seems to be the case here.
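For reference, a minimal sketch of what the example's config might look like with the setting applied. The surrounding keys are illustrative of a typical Modulus 22.09 Hydra config and are assumptions here; only the cuda_graphs line is the actual fix:

```yaml
# conf/config.yaml -- illustrative fragment; the defaults list below is
# an assumption based on the 22.09 examples, not the exact file contents
defaults:
  - modulus_default
  - scheduler: tf_exponential_lr
  - optimizer: adam
  - _self_

cuda_graphs: False   # disable CUDA graph capture to avoid the RuntimeError above
```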


I have the same problem as @tsltaywb, but in my case the file config_hardBC.yaml was already modified with the solution you describe. The problem persists even if I change cuda_graphs: True, so it looks like this setting is not the source of the problem.

I have CUDA 11.3 and the correct version of PyTorch (installed from this link with conda).

How can I solve this?


Hi @tom_02

It's not 100% clear what your problem is. The original post is specifically about a CUDA graphs error, which should not appear with CUDA graphs off.

To shut it off, you need to edit the config for the example you are actually running (for the issue in the original post, that means adding the setting to examples/helmholtz/conf/config.yaml). The hardBC config is for the hard boundary condition variant of the example, not the base script.

Let me know if it does not run after modifying the correct config file.
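One quick sanity check, since a conda install was mentioned: confirm which CUDA toolkit the installed PyTorch was actually built against. CUDA graphs require a CUDA (not ROCm) build with CUDA >= 11.0, and a mismatched or CPU-only build would reproduce the RuntimeError regardless of the driver version. This check only reads the installed torch package and makes no Modulus assumptions:

```python
import torch

# A CUDA-graphs-capable build must report a CUDA version >= 11.0 here;
# a CPU-only build reports None, and a ROCm build is also unsupported.
print("PyTorch version:", torch.__version__)
print("Built against CUDA:", torch.version.cuda)
print("torch.cuda.is_available():", torch.cuda.is_available())
```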

Thanks, this worked for me.