Terminate a CUDA kernel stuck in an endless loop? Is that possible under Linux?

Sometimes a kernel goes into an endless loop (because of a bug in the kernel, of course), and I am unable to terminate it.
kill -9 on the host process fails: the process remains (it is not a zombie; it is shown as simply running).
I have to reboot, which is clearly not a workable approach.
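
For anyone who wants to check how such a stuck process is reported, here is a minimal sketch that reads the state code from /proc on Linux. The helper name proc_state is made up for illustration. The state letters are the standard Linux ones (R running, S sleeping, D uninterruptible sleep, Z zombie, T stopped). A process that is busy inside kernel or driver code generally cannot act on a signal, SIGKILL included, until it returns to user space, which may be why kill -9 appears to do nothing here.

```python
import os


def proc_state(pid):
    """Return the one-letter state code from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST closing parenthesis.
    return stat[stat.rfind(")") + 1:].split()[0]


if __name__ == "__main__":
    # Demo on our own PID; substitute the PID of the stuck CUDA process.
    print(proc_state(os.getpid()))
```

The demo inspects the script's own PID only so that it runs anywhere; pass the PID of the hung process instead.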

Driver: 180.60
Linux: Debian/lenny
The monitor is attached to a different video card, so there is no CUDA timeout.

Is there any way to terminate such a process/kernel?

Odd, I usually have luck just pressing Ctrl-C. Sometimes it takes ~10 seconds to take effect, but it usually works. The only times a reboot has been necessary for me were with horribly buggy kernels that wrote all over device memory, probably messing up the driver.
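
If you want to automate the Ctrl-C approach, a rough sketch of a watchdog wrapper follows. It is only an illustration: the function name run_with_watchdog and its parameters are made up, and "sleep" stands in for the real CUDA program. It starts the program as a child process, sends SIGINT (the same signal Ctrl-C sends) once a time limit passes, and falls back to SIGKILL after a grace period. As this thread shows, neither signal is guaranteed to take effect while the GPU kernel is still spinning, so the wrapper returns None rather than blocking forever.

```python
import signal
import subprocess


def run_with_watchdog(cmd, timeout_s, grace_s=10.0):
    """Run cmd; on timeout send SIGINT, then SIGKILL after grace_s seconds.

    Returns the child's exit code (negative means killed by that signal),
    or None if the child still has not exited after SIGKILL.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)  # equivalent to pressing Ctrl-C
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL, i.e. kill -9
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        return None  # still stuck, e.g. blocked inside the driver


if __name__ == "__main__":
    # "sleep 5" is a placeholder for the real CUDA executable.
    print(run_with_watchdog(["sleep", "5"], timeout_s=1.0, grace_s=2.0))
```

Choose timeout_s well above the longest legitimate run time of the kernel, since the wrapper cannot tell a slow kernel from a hung one.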

I have just tested with while(true) {}. No memory use…
I just found that Ctrl-C really does work, but only after about 30 minutes (!).

Maybe unloading some module (the nvidia driver?) would help. Any ideas?

A fix for that is coming, but not until after 2.1 is out.

Thank you. But are there any workarounds for now, such as unloading the driver or something like that?

A fix in what form? Quicker return after Ctrl+C, or some larger-scale solution? Will it work on Windows?

Just a guess:

Temporarily extend (or switch) your desktop onto this graphics card, so that the CUDA timeout applies and kills the kernel.

Beware: if that does not work, you won't have a display to work with :-) Extending would be the better idea… but I am not sure whether Linux supports it.

Cool idea, but I have never heard of this being possible under X on Linux… I mean extending the desktop.

Thanks. If possible, write a script to switch the display and then get it back to the original display.

Not sure how to write it OR if it would even work. Good Luck!

Be warned: trying to unload the driver in such a case, or doing other things, can hang the entire PC, or at least the driver unload itself, until the kernel terminates. At least, that’s my experience.