CUDA Error when starting machine post suspension

pratikkgandhi · April 14, 2021, 1:07pm

Hello,
I get the error below after I suspend my Ubuntu machine and then resume it. I am running Ubuntu 20.04 and my graphics card is a GeForce RTX 3080.

File "rmm/_cuda/gpu.pyx", line 134, in rmm._cuda.gpu.getDeviceCount
rmm._cuda.gpu.CUDARuntimeError: cudaErrorUnknown: unknown error

Any help would be appreciated. Thank you!

generix · April 14, 2021, 5:13pm

I guess you either have to set up video object persistence https://download.nvidia.com/XFree86/Linux-x86_64/440.64/README/powermanagement.html
or unload and reload the nvidia-uvm module on resume.
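
A minimal sketch of the second option, assuming a systemd-based system such as Ubuntu 20.04. systemd-sleep runs every executable in its system-sleep directory with "pre" or "post" as the first argument, so a hook there can reload the module on resume. The file name reload-nvidia-uvm and the RUN dry-run variable are made up for illustration, and the hook directory may be /lib/systemd/system-sleep/ or /usr/lib/systemd/system-sleep/ depending on the distribution.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /lib/systemd/system-sleep/reload-nvidia-uvm (must be executable).
# systemd-sleep passes $1 = pre|post and $2 = the sleep type (suspend, hibernate, ...).

# RUN is an illustrative knob, not part of systemd: set RUN=echo to print
# the commands instead of executing them.
RUN="${RUN:-}"

reload_uvm() {
    # modprobe -r fails if any process still holds the module,
    # for example a browser using hardware video decode.
    $RUN modprobe -r nvidia_uvm && $RUN modprobe nvidia_uvm
}

on_sleep_event() {
    case "$1" in
        post) reload_uvm ;;   # after resume: unload, then load again
        *)    : ;;            # before suspend (or no argument): nothing to do
    esac
}

on_sleep_event "$@"
```

Running it by hand as RUN=echo ./reload-nvidia-uvm post suspend should only print the two modprobe commands, which is a cheap way to check the hook before installing it. If the unload fails because something is still using CUDA, the hook does not kill anything; that part stays manual.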

pratikkgandhi · April 14, 2021, 6:07pm

Thank you @generix for the response. I did try the second option. However, when I run sudo rmmod nvidia_uvm it gives
rmmod: ERROR: Module nvidia_uvm is in use
and forcing it with sudo rmmod -f nvidia_uvm gives
rmmod: ERROR: …/libkmod/libkmod-module.c:799 kmod_module_remove_module() could not remove 'nvidia_uvm': Resource temporarily unavailable
rmmod: ERROR: could not remove module nvidia_uvm: Resource temporarily unavailable

Any suggestions on this?

generix · April 14, 2021, 6:13pm

Sounds like you had a CUDA job or an application using CUDA running on suspend. You will have to kill it in order to unload the uvm module. Otherwise, you'll have to try option 1.
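
To find out what is holding the module, one possible sketch is to list the processes that have the UVM device node open. This assumes the node is /dev/nvidia-uvm and that fuser (from psmisc) is installed; the function name list_uvm_holders and its optional path argument are made up for illustration.

```shell
#!/bin/sh
# List processes that have the UVM device node open.
# The optional argument overrides the device path (illustrative only).
list_uvm_holders() {
    device="${1:-/dev/nvidia-uvm}"
    if [ ! -e "$device" ]; then
        echo "no such device: $device"
        return 1
    fi
    # fuser -v shows the user, PID, and command name of each holder.
    # Run as root to see processes owned by other users.
    fuser -v "$device"
}
```

Call list_uvm_holders as root after sourcing the file. Anything it lists has to exit before the module can be unloaded.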

pratikkgandhi · April 14, 2021, 6:31pm

Essentially I had to close everything (including my browser, I didn't know it was using CUDA) and then was able to run those commands of option 1. I think CUDA came back up after that. Do you have any recommendations for automating this when I log back in after a suspend, or for an alternative to suspending? Thanks @generix.

generix · April 14, 2021, 6:48pm

Yeah, I didn't think of hardware acceleration for video decode in Chrome/Electron apps. Those likely hold on to the uvm module as well.
The latest drivers have part of the power management methods set as default, so maybe try adding the graphics drivers PPA to get the latest driver and check if that works better with CUDA on suspend.

pratikkgandhi · April 15, 2021, 12:23am

Is this a good reference for the install: How to Use Ubuntu Nvidia PPA?
Thanks!

generix · April 16, 2021, 11:03am

Yes, explained so even beginners should get it right.
