Hi there, I'm looking for some help getting PyTorch running again.
Context
I first checked my installed CUDA toolkit version using nvcc:
gibler@gibler-MS-7D43:/usr/lib/x86_64-linux-gnu$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
I then downloaded cuDNN 8.8 for CUDA 11.x and installed it.
This is the output I now get when I try to start Stable Diffusion:
Traceback (most recent call last):
  File "/home/myuser/stable-diffusion-webui/launch.py", line 360, in <module>
    prepare_environment()
  File "/home/myuser/stable-diffusion-webui/launch.py", line 272, in prepare_environment
    run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
  File "/home/myuser/stable-diffusion-webui/launch.py", line 129, in run_python
    return run(f'"{python}" -c "{code}"', desc, errdesc)
  File "/home/myuser/stable-diffusion-webui/launch.py", line 105, in run
    raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/home/myuser/stable-diffusion-webui/venv/bin/python3" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 1
stdout: <empty>
stderr: /home/myuser/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
How can I diagnose the core issue and fix this? The error message doesn't point me toward anything I know how to act on.
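As a starting point for diagnosis: error 803 is CUDA's cudaErrorSystemDriverMismatch, which indicates a mismatch between the display driver and the CUDA driver library rather than a problem with nvcc or cuDNN. The sketch below is one way to compare the two sides. It assumes a Linux system where the loaded kernel module reports its version in /proc/driver/nvidia/version and where nvidia-smi is on the PATH. The helper names (kernel_module_version, userspace_driver_version, report) are hypothetical and only for illustration.

```python
import re
import subprocess
from pathlib import Path

# Assumption: the loaded NVIDIA kernel module reports its version in this file.
PROC_VERSION = Path("/proc/driver/nvidia/version")


def kernel_module_version(text):
    """Extract the driver version from the contents of /proc/driver/nvidia/version."""
    match = re.search(r"Kernel Module(?: for \S+)?\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def userspace_driver_version():
    """Ask nvidia-smi (the user-space side) which driver version it sees."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"unavailable ({exc})"
    # On a mismatch nvidia-smi prints an error message instead of a version.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "unavailable (no output)"


def report():
    """Collect the versions relevant to a driver / CUDA library mismatch."""
    lines = []
    try:
        loaded = kernel_module_version(PROC_VERSION.read_text())
    except OSError:
        loaded = None
    lines.append(f"kernel module driver: {loaded or 'not loaded / not found'}")
    lines.append(f"nvidia-smi driver:    {userspace_driver_version()}")
    try:
        import torch  # run inside the webui venv so this is the same torch
        lines.append(f"torch {torch.__version__}, built for CUDA {torch.version.cuda}")
    except Exception as exc:
        lines.append(f"torch: unavailable ({exc})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

If nvidia-smi prints an error or a version different from the kernel module's, the user-space driver libraries and the loaded kernel module are likely out of sync. A common cause is a driver package update without a reboot, in which case rebooting is usually enough. Note also that PyTorch wheels installed with pip bundle their own CUDA runtime, so the nvcc 11.5 toolkit shown above is probably not what torch is using.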