Disabling specific CUDA GPUs
How do I prevent CUDA code from running on the low-power onboard GPU?

I administer a 64-bit Ubuntu 9.04 system with four Tesla C1060s that is used by a number of research groups. The machine works well in general, but apparently certain multi-device CUDA code doesn’t run properly because the machine also has an onboard GPU (an nForce 980a), which shows up as a usable CUDA device but doesn’t have the specs of the C1060 cards.

Strangely, the devices are numbered as follows by default:
Device 0: “Tesla C1060”
Device 1: “nForce 980a/780a SLI”
Device 2: “Tesla C1060”
Device 3: “Tesla C1060”
Device 4: “Tesla C1060”

I’d like to disable running CUDA code on the nForce onboard card completely to avoid these issues, but I’d rather not do so in the system BIOS. It’s important that the nForce card stay active so that it can run the monitor attached to the box, since the C1060s don’t have video outputs.

The system currently runs developer driver version 195.36.15 and CUDA toolkit version 3.0.

I’d appreciate any input anyone might have. Thanks very much!

Use nvidia-smi.

Thanks for the quick reply…

I have already tried setting the compute mode on the nForce card to Prohibited, but multi-GPU programs (such as simpleMultiGPU in the SDK) still attempt to use the card by default and produce a “no CUDA-capable device is available” error. Is there a way to prevent the nForce card from appearing to CUDA software at all?

You can change the permissions or ownership of /dev/nvidia1 (the onboard GPU) so that a CUDA app fails when it tries to open that device (could not open … ).
Running the NVIDIA driver installer with --extract-only unpacks the thin kernel module source code; nv-reg.h in that source has options that control how the module handles the /dev/nvidia* files and their permissions.

For the numbering, you could perhaps create the /dev/nvidia* files manually with mknod, or try hard links, so that the onboard GPU comes last. I don’t know how to hide it.
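
If the onboard GPU cannot be hidden system-wide, an application can also skip it on its own when it enumerates devices. The following is only a rough sketch of that idea, not something taken from the SDK samples: the DeviceInfo struct, the selectComputeDevices and queryDevices helpers, and the multiprocessor threshold of 16 are all hypothetical names and values chosen for illustration. It assumes the onboard GPU reports far fewer multiprocessors than a Tesla C1060 (which has 30). The CUDA part relies only on cudaGetDeviceCount and cudaGetDeviceProperties from the runtime API and is compiled only under nvcc.

```cpp
#include <string>
#include <vector>

// Minimal per-device record; mirrors the cudaDeviceProp fields used below.
// (Hypothetical helper type, not part of the CUDA API.)
struct DeviceInfo {
    int index;                // CUDA device ordinal
    std::string name;         // cudaDeviceProp::name
    int multiProcessorCount;  // cudaDeviceProp::multiProcessorCount
    bool prohibited;          // computeMode == cudaComputeModeProhibited
};

// Assumed cutoff for illustration: a Tesla C1060 has 30 multiprocessors,
// and the onboard GPU is assumed to have far fewer than 16.
const int kMinMultiprocessors = 16;

// Return the ordinals of devices worth running compute work on:
// skip devices set to Prohibited and devices below the size cutoff.
std::vector<int> selectComputeDevices(const std::vector<DeviceInfo>& devices,
                                      int minMultiprocessors = kMinMultiprocessors) {
    std::vector<int> usable;
    for (const DeviceInfo& d : devices) {
        if (d.prohibited) continue;
        if (d.multiProcessorCount < minMultiprocessors) continue;
        usable.push_back(d.index);
    }
    return usable;
}

#ifdef __CUDACC__
#include <cuda_runtime.h>

// Fill DeviceInfo records from the CUDA runtime (only built with nvcc).
std::vector<DeviceInfo> queryDevices() {
    std::vector<DeviceInfo> devices;
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return devices;
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) continue;
        devices.push_back({i, prop.name, prop.multiProcessorCount,
                           prop.computeMode == cudaComputeModeProhibited});
    }
    return devices;
}
#endif
```

A multi-GPU program would then call cudaSetDevice only on the ordinals returned by selectComputeDevices(queryDevices()) instead of looping over every device from 0 to the device count. This does not stop other users' code from picking the onboard GPU; it only keeps a cooperating application off it.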

There’s a new feature to address this in an upcoming release.