nvidia-smi reset not supported on this device

My GPU is hanging. That is, I had to abort (via kill -9, exit status 137)
the Linux process using it, since it was looping.
Every subsequent access to the GPU causes CUDA to loop again on the first
cudaMallocPitch() and has to be aborted manually (giving exit status 131).
(Both exit statuses follow the shell's 128+signal convention:
137 = 128 + 9 for SIGKILL, and 131 = 128 + 3 for SIGQUIT.)

As an ordinary user (not root) I have tried /usr/bin/nvidia-smi -r -i 0,
but this gives:
“Resetting GPU is not supported on this device”

Which GPUs does nvidia-smi reset support?

Is there any way out of this without rebooting Linux?

Thank you
Bill

PS: nvidia-smi is prone to giving “Invalid combination of input arguments…”.
It would be nice if I could follow its hint about what it does not like
about the command line I have used.
Maybe the documentation could include a few more examples?

Hi wlangdon,

Please take a look at the output of nvidia-smi -h:

$ nvidia-smi -h
[...]
Supported products:
- Full Support
    - NVIDIA Tesla Line:
            S2050, C2050, C2070, C2075,
            M2050, M2070, M2075, M2090,
            X2070, X2090,
            K10, K20, K20X, K20Xm, K20c, K20m, K20s
    - NVIDIA Quadro Line:
            410, 600, 2000, 4000, 5000, 6000, 7000, M2070-Q
            K2000, K2000D, K4000, K5000, K6000
    - NVIDIA GRID Line:
            K1, K2, K340, K520
[...]

Which GPU do you have?

Can you provide an example of the nvidia-smi queries that fail, where you have trouble finding out what is causing the error?

Regards,
Przemyslaw Zych

Dear Przemyslaw,
Thank you for pointing this out.
The error is with a GeForce GTX 295.
I am “curing” it with hard reboots at present :-(

/usr/bin/nvidia-smi --id 0 gives following error message
Invalid combination of input arguments. Please run ‘nvidia-smi -h’ for help.

Sure, it's easy when you know, but an example such as nvidia-smi --id=0
would make the documentation easier to use.
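
In case it helps anyone else, here is the failing form next to the ones that
work for me (my reading is that the long option wants an ‘=’ before its value):

$ nvidia-smi --id 0      # fails: "Invalid combination of input arguments..."
$ nvidia-smi --id=0      # works: long-option form with '='
$ nvidia-smi -i 0        # works: equivalent short-option form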

Thanks again
Bill

Have you tried unloading the “nvidia” driver?
That would still require logging out of any GUIs on other Nvidia cards, but is less disruptive than a full reboot.
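Roughly, as root (a sketch; the exact module name can vary with the driver
version, and rmmod will refuse while anything still has the device files open):

$ sudo fuser -v /dev/nvidia*     # list processes still holding the device files
$ sudo rmmod nvidia              # unload the driver once nothing is using it
$ sudo modprobe nvidia           # reload it; the GPU should come back clean

If rmmod reports that the module is in use, the fuser output shows which processes need to be stopped first.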