CUDA security

Hi there!

I wonder if there is a mechanism that can protect graphics cards from incorrect programming (mainly out-of-bounds memory writes) or even malicious code.

I’m asking because, while developing, I had to reboot my computer (running Linux) after my CUDA program did something the graphics card didn’t like. The screen stopped updating; the only thing that still moved was the mouse pointer. The rest of the system kept running (music, network, etc.), but I could not get my X server back or switch to a console (Ctrl+Alt+F1).

Another time, my CUDA program was able to write to video memory through an incorrect memory access. That time there were no lasting consequences, because the screen could be refreshed by moving windows around.

This could also lead to CUDA programs interfering with other CUDA programs.

So, is there a solution?

There is some level of memory protection in CUDA. It is certainly possible to crash your kernel by writing to bad locations in memory, but you shouldn’t be able to crash the whole machine. I would consider that a bug.

If possible, I would recommend debugging your kernels in emulation mode, where you will get an exception if you write out of range.
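
Independently of emulation mode, most out-of-range writes can be avoided by guarding the index inside the kernel and checking the error codes the runtime returns. Here is a minimal sketch of that idea; the kernel `scale` and the helper `scaleOnDevice` are hypothetical names made up for illustration, not anything from the poster's program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies n floats in place by factor.
// The guard keeps threads of the last, partially filled block
// from touching memory past the end of the buffer.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// Hypothetical helper: copies host data to the device, runs the kernel,
// copies the result back, and returns the first error encountered.
cudaError_t scaleOnDevice(float *host, int n, float factor)
{
    float *dev = NULL;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dev, bytes);
    if (err != cudaSuccess)
        return err;

    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;  // round up
        scale<<<blocks, threads>>>(dev, n, factor);
        err = cudaGetLastError();                  // reports launch failures
    }
    if (err == cudaSuccess) {
        // A blocking copy waits for the kernel to finish and also
        // reports errors that occurred while the kernel was running.
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return err;
}
```

If `scaleOnDevice` returns anything other than `cudaSuccess`, `cudaGetErrorString(err)` gives a readable message. The guard only protects against indices that run past `n`; it does not catch a wrong pointer or a wrong `n` passed in from the host.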

If you can use two separate GPUs, one for CUDA and one for display, that can also help.
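
With two GPUs, the program still has to pick the one that is not driving the screen. One possible heuristic, sketched below, is to look for a device with no kernel run-time limit, since that limit is typically enabled on a GPU attached to a display. This assumes a toolkit recent enough to provide `cudaDeviceGetAttribute`; the helper name `pickComputeDevice` is made up for illustration, and the heuristic is not guaranteed to identify the display card on every system.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the index of the first device that has no
// kernel run-time limit, or -1 if no such device is found.
int pickComputeDevice()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess)
        return -1;

    for (int dev = 0; dev < count; ++dev) {
        int hasTimeout = 0;
        if (cudaDeviceGetAttribute(&hasTimeout,
                                   cudaDevAttrKernelExecTimeout,
                                   dev) != cudaSuccess)
            continue;  // skip devices we cannot query
        if (!hasTimeout)
            return dev;
    }
    return -1;
}
```

A caller would then do something like `int dev = pickComputeDevice(); if (dev >= 0) cudaSetDevice(dev);` before allocating any device memory, and fall back to the default device otherwise.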

I’ve seen CUDA apps crash X, but never the whole machine (except on pre-release hardware). I leave an ssh server running on my machine and when the X screen freezes up due to a bug in my CUDA app, I log in and reboot the system from ssh.

It’s certainly possible to crash CUDA, or even the X server (if you’re running it on the same card), but as the G80 has a separate memory space for each context, malicious code shouldn’t ever be able to write to other programs’ memory spaces.

Just out of curiosity, is security the rationale behind the serious limitation that data cannot be shared between CUDA contexts? This is a real pain (or at least a design issue for CUDA) when programming on SMP/multicore systems.

Cédric [who should learn to re-read a post instead of editing it twice]