CUDA (375.66) is failing with unknown error 30 after suspending Ubuntu 16.04

Hi,

I have a GTX 1080 Ti on Ubuntu 16.04

Driver: 375.66

CUDA 8.0-61.1
libcuda 375.66
cuda-drivers 375.51-1

Problem:

After suspending the PC, the CUDA runtime stops working:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: Unable to get the number of gpus available: unknown error)

However, at the same time nvidia-smi works absolutely fine:

cypreess@gtx:~/dev/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release$ nvidia-smi 
Fri Jun  9 20:49:59 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 0000:01:00.0      On |                  N/A |
| 23%   37C    P8    18W / 250W |    727MiB / 11169MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1261    G   /usr/lib/xorg/Xorg                              25MiB |
|    0      1463    G   /usr/lib/xorg/Xorg                             329MiB |
|    0      1801    G   /usr/bin/gnome-shell                           182MiB |
|    0      2526    G   /proc/self/exe                                  27MiB |
|    0      5243    G   ...el-token=682029E6D17C8080D4B5A7BE0DA20F10   130MiB |
+-----------------------------------------------------------------------------+
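
For reference, error 30 from cudaGetDeviceCount is just cudaErrorUnknown, so it does not say much by itself. A few quick checks after a resume (standard commands, nothing specific to this setup) help narrow down whether it is the kernel driver or the nvidia_uvm module that breaks:

lsmod | grep nvidia           # which NVIDIA kernel modules are loaded (nvidia, nvidia_uvm, nvidia_drm, ...)
ls -l /dev/nvidia*            # device nodes present? /dev/nvidia-uvm in particular
dmesg | grep -iE 'nvrm|xid'   # NVRM / Xid messages logged around the suspend/resume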

I tried changing the persistence and compute modes using:

/usr/bin/nvidia-smi -pm ENABLED
/usr/bin/nvidia-smi -c EXCLUSIVE_PROCESS

without any luck.

Only a PC restart fixes the problem.

nvidia-bug-report.log.gz (168 KB)

Update:

I just found a workaround that avoids a PC restart:

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
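
To avoid running this by hand after every resume, it can be hooked into the suspend/resume cycle. This is only a sketch assuming a systemd-based suspend (the default on Ubuntu 16.04): executable scripts in /lib/systemd/system-sleep/ are called with "pre"/"post" and the sleep type as arguments, so a hypothetical /lib/systemd/system-sleep/reload-nvidia-uvm like the one below (made executable with chmod +x) would reload nvidia_uvm on every wake:

#!/bin/sh
# Hypothetical systemd sleep hook: /lib/systemd/system-sleep/reload-nvidia-uvm
# $1 is "pre" or "post"; $2 is "suspend", "hibernate" or "hybrid-sleep"
case "$1" in
    post)
        # Assumption: no CUDA process is still holding nvidia_uvm, otherwise rmmod fails
        /sbin/rmmod nvidia_uvm
        /sbin/modprobe nvidia_uvm
        ;;
esac

(If a CUDA process survives the suspend, rmmod will refuse to unload the module, so the hook only helps when nothing is using the GPU for compute at that moment.)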

Can NVIDIA please provide a response to this problem?

I can confirm that it also happens on a Quadro M2000M.

It is still happening on the 384.69 drivers, and it seems quite ridiculous to have to unload/reload modules after every suspend/resume cycle in 2017.

Thanks!

I would like to add more information about this problem:

  • Enabling persistence mode via nvidia-persistenced actually makes the box lock up on wake: no pings, nothing moves, total freeze.
  • With persistence mode disabled, dmesg shows the following on wake:

[ 319.338880] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000030, engmask 00000101, intr 10000000
[ 319.342072] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000003, engmask 00000104, intr 10000000

Is there any way to fix this problem?