I administer a 64-bit Ubuntu 9.04 system with four Tesla C1060s that is used by a number of research groups. The machine works well in general, but some multi-device CUDA code doesn’t run properly because the machine also has an onboard GPU (an nForce 980a), which shows up as a usable CUDA device but doesn’t have the specs of the C1060 cards.
Strangely, the devices are numbered as follows by default:
Device 0: “Tesla C1060”
Device 1: “nForce 980a/780a SLI”
Device 2: “Tesla C1060”
Device 3: “Tesla C1060”
Device 4: “Tesla C1060”
I’d like to completely disable running CUDA code on the onboard nForce GPU to avoid these issues, but I’d rather not disable the GPU itself in the system BIOS. It’s important that the nForce GPU stay active so that it can drive the monitor attached to the box, since the C1060s don’t have video outputs.
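To make the goal concrete, here is a minimal sketch of the kind of filtering each application would otherwise have to do on its own. It is only an illustration under assumptions: `DeviceInfo` and `selectDevices` are hypothetical names I made up, and the struct stands in for the `name` field that a real program would read with `cudaGetDeviceProperties`. As far as I understand, the resulting index list could then be handed to `cudaSetValidDevices`, but that only helps code that opts in, which is why I'm looking for a system-wide setting instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the device name reported by the CUDA runtime.
struct DeviceInfo {
    std::string name;
};

// Return the indices of all devices whose name contains `wanted`.
// The indices follow the enumeration order, so gaps (like device 1 here)
// are preserved rather than renumbered.
std::vector<int> selectDevices(const std::vector<DeviceInfo>& devices,
                               const std::string& wanted) {
    assert(!wanted.empty());  // an empty pattern would match every device
    std::vector<int> keep;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].name.find(wanted) != std::string::npos) {
            keep.push_back(static_cast<int>(i));
        }
    }
    return keep;
}
```

With the device list above, calling `selectDevices` with the pattern `"Tesla"` would keep devices 0, 2, 3 and 4 and skip the nForce entry.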
The system currently runs developer driver version 195.36.15 and CUDA toolkit version 3.0.
I’d appreciate any input anyone might have. Thanks very much!