Volatile GPU-Util Err

I am seeing an error in nvidia-smi.exe but have no idea what it means. Under ‘Volatile GPU-Util’ I just have ‘ERR!’. This may be nothing, or it may explain other issues I have. I am not sure whether it is related to the insufficient permissions problem below.

My machine is Windows 7 (64-bit) and the CUDA version is 7.5. The card is a Quadro 4000.

    +------------------------------------------------------+
    | NVIDIA-SMI 353.90     Driver Version: 353.90         |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Quadro 4000        WDDM  | 0000:03:00.0      On |                  N/A |
    | 36%   80C   P12    N/A /  N/A |     86MiB /  2048MiB |    ERR!      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |    0         4    C   Insufficient Permissions                     N/A      |
    +-----------------------------------------------------------------------------+

I am facing the same problem.

I have CUDA 7.5 installed on a 64-bit server running Windows Server 2012 R2, with an NVIDIA Quadro K6000 GPU.

I can import theano, and nvidia-smi is showing GPU status.

But in nvidia-smi, Volatile GPU-Util shows ERR!

    +------------------------------------------------------+
    | NVIDIA-SMI 353.90     Driver Version: 353.90         |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Quadro K6000       WDDM  | 0000:08:00.0     Off |                   40 |
    | 26%   36C    P8    21W / 225W |    357MiB / 11520MiB |    ERR!      Default |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |    0        32  C+G   ...x86)\Google\Chrome\Application\chrome.exe N/A      |
    |    0       476  C+G   C:\Windows\Explorer.EXE                      N/A      |
    |    0       620  C+G   Insufficient Permissions                     N/A      |
    |    0      2924  C+G   Insufficient Permissions                     N/A      |
    |    0      3040  C+G   Insufficient Permissions                     N/A      |
    |    0      3568  C+G   Insufficient Permissions                     N/A      |
    |    0      3804  C+G   Insufficient Permissions                     N/A      |
    |    0      3828  C+G   Insufficient Permissions                     N/A      |
    |    0      5036  C+G   Insufficient Permissions                     N/A      |
    +-----------------------------------------------------------------------------+

Please explain why this is happening and how to fix it.

Thanks in advance.

The uncorrected ECC error column showing 40 could be a sign that the Quadro K6000's on-board memory is starting to fail. Try powering the system off completely and rebooting, then see if the utilization column populates correctly. Also check that all of the PCI-E power connectors on the card are plugged in and that the card is firmly seated in its motherboard slot. If you have a spare GPU, place it in the same slot and see whether the behavior follows the card or the motherboard.
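
To compare readings before and after a reboot or a card swap, it can help to pull just the two values in question instead of reading the full table. The following is a minimal Python sketch, not a definitive tool. It assumes that nvidia-smi is on the PATH and that the installed driver's nvidia-smi supports `--query-gpu` with the `utilization.gpu` and `ecc.errors.uncorrected.volatile.total` fields (run `nvidia-smi --help-query-gpu` to confirm). The exact text printed when a value cannot be read is an assumption here, so the parser simply treats any non-numeric field as unreadable. The function names are made up for this illustration.

```python
import subprocess

# Fields requested from nvidia-smi. Availability depends on the driver
# version; check `nvidia-smi --help-query-gpu` on the machine in question.
QUERY_FIELDS = "utilization.gpu,ecc.errors.uncorrected.volatile.total"


def to_int_or_none(field):
    """Return the leading integer of a CSV field, or None if it is not numeric."""
    token = field.strip().split(" ")[0]
    return int(token) if token.isdigit() else None


def parse_status_line(line):
    """Parse one 'utilization, ecc_errors' CSV row into a small dict."""
    util_field, ecc_field = line.split(",")
    util = to_int_or_none(util_field)
    ecc = to_int_or_none(ecc_field)
    return {
        "utilization_percent": util,
        "uncorrected_ecc_errors": ecc,
        # False when the field is an error or "not supported" string.
        "utilization_readable": util is not None,
        # True only when a numeric, non-zero error count was reported.
        "ecc_errors_present": ecc is not None and ecc > 0,
    }


def query_gpu_status():
    """Run nvidia-smi (assumed to be on PATH) and parse one row per GPU."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + QUERY_FIELDS, "--format=csv,noheader"],
        universal_newlines=True,
    )
    return [parse_status_line(row) for row in output.splitlines() if row.strip()]
```

Calling `query_gpu_status()` returns one dict per GPU. If `ecc_errors_present` is still true after a full power cycle, or `utilization_readable` stays false with a different card in the same slot, that narrows down whether the card or the system is at fault.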