Failed to initialize NVML: Unknown Error when running nvidia-smi
Anyone else have this error?

Since I upgraded to the CUDA 4.0 development drivers, I started getting
“Failed to initialize NVML: Unknown Error” whenever I run nvidia-smi. Is anyone else getting this?
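
For anyone who wants to detect this failure from a script rather than by eye, here is a minimal Python sketch. It assumes nvidia-smi is on the PATH and that the failure text begins with the message quoted above; the function names are made up for illustration.

```python
import shutil
import subprocess

# Start of the message quoted in this thread (assumed to be stable across drivers).
NVML_FAILURE_MARKER = "Failed to initialize NVML"


def is_nvml_init_failure(output: str) -> bool:
    """Return True if the text contains the NVML initialization failure message."""
    return NVML_FAILURE_MARKER.lower() in output.lower()


def run_nvidia_smi() -> str:
    """Run nvidia-smi with no arguments and return stdout and stderr combined."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "nvidia-smi not found on PATH"
    result = subprocess.run(
        [exe], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout


if __name__ == "__main__":
    text = run_nvidia_smi()
    if is_nvml_init_failure(text):
        print("NVML failed to initialize:")
    print(text)
```

The match is case-insensitive because the message shows up in this thread both as "Unknown Error" and "Unknown error".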

I haven’t had that one, but nvidia-smi from the 4.0rc dev drivers is basically 100% broken on consumer GPUs at the moment. Tim Murray said it will be fixed in the next 4.0 release.

Thanks. Sounds like I missed the memo, then.

I get the same error with CUDA 3.2 on a GTX 580. The only solution is to reboot the machine.

I have the same problem on a system with two NVIDIA Quadro 6000 GPUs:
NVIDIA: could not open the device file /dev/nvidia1 (Input/output error).
Failed to initialize NVML: Unknown Error

The weird thing is that we have several identical machines and the error occurs on only a couple of them; the others run just fine (same driver, same toolkit).

Driver: 275.09.07
CUDA Toolkit: 4.0

Cheers, Sandra
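
Since the first message above names a specific device file, one way to narrow things down on an affected machine is to try opening each device file and see which ones fail. The following is a minimal Python sketch, not a diagnostic tool from NVIDIA; the helper names and the `/dev/nvidia*` glob pattern are assumptions for illustration, and on some systems the pattern may match entries that are not GPU device files.

```python
import errno
import glob
import os


def probe_device(path: str) -> str:
    """Try to open a device file read/write; return 'ok' or the errno name."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        # e.g. 'EIO' for "Input/output error", 'ENOENT' if the file is missing
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


def probe_all(pattern: str = "/dev/nvidia*") -> dict:
    """Probe every path matching the pattern and map path -> result."""
    return {path: probe_device(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    for path, status in probe_all().items():
        print(f"{path}: {status}")
```

Running this on one good machine and one bad machine with the same driver and toolkit would at least show whether the failure is tied to one particular device file.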

CUDA 4.0 final solved this issue for me with GTX 480. But when I added a GTX 580 to the system, the problem reappeared. I have a ticket open with NVIDIA.

Could you let us know as soon as you get feedback (any kind)? Thanks!

Do you see this message all the time, even right after a reboot?


Using driver 275.19, I get

Failed to initialize NVML: Unknown error

on compute nodes with GTX 285s and GTX 480s (2 GPUs per node), using a SysV init.d script.

But it works fine on my head node, which uses Xorg and one Quadro FX 380.
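
On nodes that do not run X, the /dev/nvidia* device files are typically created by an init script instead, so one thing that may be worth ruling out is a missing or mismatched device node. The thread does not confirm that this is the cause here. The sketch below assumes the conventional numbering used by the NVIDIA Linux driver (character major 195, one minor per GPU, minor 255 for /dev/nvidiactl); the function names are hypothetical.

```python
import os
import stat

NVIDIA_MAJOR = 195      # character device major number used by the NVIDIA driver
NVIDIACTL_MINOR = 255   # minor number of the control device /dev/nvidiactl


def expected_nodes(gpu_count: int) -> list:
    """List (path, major, minor) for the device nodes expected with gpu_count GPUs."""
    nodes = [(f"/dev/nvidia{i}", NVIDIA_MAJOR, i) for i in range(gpu_count)]
    nodes.append(("/dev/nvidiactl", NVIDIA_MAJOR, NVIDIACTL_MINOR))
    return nodes


def check_node(path: str, major: int, minor: int) -> str:
    """Compare an existing device node against the expected major/minor pair."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISCHR(st.st_mode):
        return "not a character device"
    if (os.major(st.st_rdev), os.minor(st.st_rdev)) != (major, minor):
        return "wrong device numbers"
    return "ok"


if __name__ == "__main__":
    # 2 GPUs per node, as described in the post above
    for path, major, minor in expected_nodes(gpu_count=2):
        print(path, check_node(path, major, minor))
```

This only inspects the nodes; it does not create or modify anything, so it should be safe to run without root.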

NVIDIA suggests trying the just-posted 280.13 beta drivers. I’ll report my results when the problematic machine is back online (it is currently down, waiting for a replacement part to arrive).

Thanks, DrAnderson42, for the advisory.

nvidia-smi now works on all my nodes, both the 480s and the 285s.
I also have an earlier version, 275.21, working on compute nodes with C2050s.
I will soon replace the 285s with 580s.
If I have problems, I will report them here.

Driver: 280.13 and 275.89

Card: Quadro 4000

Windows 7 Enterprise and Windows 2008 HPC Server

In all mentioned configurations I get the same error:

Failed to initialize NVML: Unknown Error

Please help!

Oh, sorry, I didn’t see that this was Linux specific.

Anyway, I solved the problem. I needed a second video card made by NVIDIA. The display adapter must not be from another manufacturer if you want to use that program. I can accept this, but why not return an appropriate error message that tells me so?

Nevertheless CUDA rocks!