My GPU is hanging, i.e. I aborted (via kill -9, exit status 137)
the Linux process using it, since it was looping.
All subsequent attempts to access the GPU cause CUDA to loop again on the first
cudaMallocPitch() call, and the process has to be aborted manually (giving exit status 131).
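For reference, the two exit statuses above fit the usual shell convention of reporting 128 plus the signal number when a process is killed by a signal. The short Python sketch below decodes them under that assumption. The helper name describe_exit_status is made up for illustration, and the signal names come from Python's standard signal module on Linux.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status as a shell reports it (e.g. via `echo $?`).

    Assumes the common convention that a status above 128 means
    "terminated by signal number (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with code {status}"


# The two statuses mentioned in the post.
for status in (137, 131):
    print(status, describe_exit_status(status))
```

On Linux this reports SIGKILL for 137 (the kill -9) and SIGQUIT for 131, so both statuses describe how the process was stopped, not an error code returned by CUDA.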
In user mode (not root) I have tried /usr/bin/nvidia-smi -r -i 0
but this gives:
“Resetting GPU is not supported on this device”
Which GPUs does nvidia-smi reset support?
Is there any way out of this without rebooting Linux?
Thank you
Bill
PS: nvidia-smi is prone to give “Invalid combination of input arguments…”
It would be nice if I could follow its hint about what it does not like
about the command line I have used.
Maybe the documentation could include a few more examples?
Have you tried unloading the “nvidia” driver?
That would still require logging out of any GUI sessions running on other NVIDIA cards, but it is less disruptive than a full reboot.
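A rough sketch of how the unload could be planned, in Python. It only reads /proc/modules and prints the rmmod commands in an order that removes each module's users first; it does not execute anything. The function name nvidia_unload_order is hypothetical, and the module names in the comments (nvidia_uvm, nvidia_drm, nvidia_modeset) are examples of what may be loaded, not a guaranteed list for every driver version. Actually running the printed commands needs root, and rmmod will likely refuse while anything (an X server, or a process stuck in the driver) still holds the module.

```python
from pathlib import Path


def nvidia_unload_order(proc_modules_text: str) -> list[str]:
    """Order loaded nvidia* modules so that each module's users come first.

    Each /proc/modules line looks like:
        name size refcount users state address
    where `users` is a comma-separated list of modules using `name`, or "-".
    """
    users = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        if fields[3] == "-":
            users[fields[0]] = []
        else:
            # Skip empty tokens and markers such as "[permanent]".
            users[fields[0]] = [
                u for u in fields[3].split(",") if u and not u.startswith("[")
            ]

    order, seen = [], set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for user in users.get(name, []):
            visit(user)  # anything using `name` must be removed before it
        order.append(name)

    for name in users:
        if name.startswith("nvidia"):
            visit(name)
    return order


# Dry run: print the commands, do not execute them.
modules_file = Path("/proc/modules")
if modules_file.exists():
    for module in nvidia_unload_order(modules_file.read_text()):
        print(f"rmmod {module}")
    print("modprobe nvidia  # reload the driver afterwards")
```

If the printed list is empty, no module whose name starts with "nvidia" is loaded. If rmmod reports that a module is in use, something still holds the GPU, and a reboot may be the only way out after all.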