GeForce 210 for $20 after rebate: perfect as a display-only GPU for CUDA development

I took advantage of this deal. Also note that the MSI card has 512 MB of slightly overclocked GDDR5 memory, giving it some healthy bandwidth.

This seemed like a good idea, so I got one of these MSI GeForce 210 cards, though from newegg.com. I’ve just plugged it in - there are good things and bad things…

Good things: Yes, my GTX 260 is now compute-only! It also gets to use all of its memory for computing. The 210 seems plenty fast for what I do; I was worried about 3D capability in MATLAB, but that turns out to be perfectly adequate as well. And no more peculiar X behavior while a compute run is underway. I suspect less power will be used too, since the 260 no longer has to power up just so I can read e-mail or browse the web.

Bad things: The PCIe bandwidth has dropped to 8X rather than 16X now that I have two cards in; I confirmed this with bandwidthTest. I believe this is the standard behavior when two cards occupy the slots on this board. The motherboard (ASUS M4A79XTD) BIOS will not send boot video to the 210; it insists on the 260, and I don’t seem to be able to change that. I’ve tweaked my /etc/X11/xorg.conf file so that the X server starts on the 210. But now I face the endless chore of swapping the monitor cable between the 260 and the 210 whenever I boot up or have to reinstall the nvidia module after a kernel update. That will be awkward.
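For anyone making the same xorg.conf tweak, the key part is a Device section pinned to the 210 by bus ID, with the Screen section pointed at it. A minimal sketch, assuming the 210 sits at PCI bus 2 (your BusID will almost certainly differ; find it with lspci):

    Section "Device"
        Identifier "GeForce210"
        Driver     "nvidia"
        BusID      "PCI:2:0:0"    # assumed location of the 210; verify with lspci
    EndSection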

I was wondering how CUDA would select which card to compute on; by default it seems to use Device 0, which is the GTX 260 (in the primary, 16X PCIe slot). Presumably, if I swapped the 210 and 260, CUDA would default to computing on the 210. (Swapping is impossible anyway…there isn’t enough room for the 260 on the other PCIe slot.)
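A quick way to confirm which card is Device 0 is to enumerate the devices and print their names; a minimal sketch using the runtime API:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            // prints e.g. "Device 0: GeForce GTX 260"
            printf("Device %d: %s\n", i, prop.name);
        }
        return 0;
    }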

I suspect a PCI (not PCIe) card would be a better solution here, at least for me: the BIOS could be set to boot to it, and I would retain the 16X PCIe capability. Except that I have no free PCI slots left. I’ll keep using the MSI 210 for now…but may yank it out if it gets too annoying.

How many lanes your card gets depends on your motherboard chipset. There are motherboards that offer two “true” x16 PCIe slots (somewhere around 32 to 38 lanes in total). But once you get past two x16 slots your options start to dwindle: either the northbridge’s PCIe lanes are divided up into permutations of x16 and x8 (or worse), or they’re running through a PCIe switch that presents more lanes (which is cool).
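On Linux you can read the negotiated link width for each card straight out of lspci, which is a quick cross-check against bandwidthTest. The grep pattern below is just one way to filter the output; it prints the LnkSta line for every PCIe device, so match each one with the VGA line above it:

    sudo lspci -vv | grep -E 'VGA compatible|LnkSta:'
    # each LnkSta line reports the live link, e.g. "Speed 2.5GT/s, Width x8"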

Regarding which GPU CUDA selects, that’s entirely controllable by your program. The bandwidthTest program is a good example of code that lets you select an individual GPU via the “--device” command-line option.
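For example, to run the bandwidth test against the second card (devices are numbered from 0; the sample also accepts “all”, if memory serves):

    ./bandwidthTest --device=1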

Do you have a BIOS option to send boot video to whichever card has a display connected? I’m guessing not.

I feel your pain on slot and card width arrangements. It’s like Jenga and you don’t often figure it out until after you’ve bought the hardware. :|

One other comment: a PCI card is a good idea that I recently looked into. If you check eBay there are “Quadro NVS” cards with PCI interfaces going for relatively low bucks. There are also PCIe x1 cards, but they are more expensive, presumably because x1 is a little faster… and you get a whopping 8 or 16 CUDA cores in the bargain. :)

Just to note that selecting the compute device seems to require nothing more than the call:

cudaSetDevice(currentDevice);

where currentDevice is 0 or 1. (This is taken from the bandwidthTest routine.)
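If you want the choice to survive cards being swapped or re-enumerated, one option is to select the device by name rather than by index. A minimal sketch; the “GTX 260” substring and the setDeviceByName helper are my own, not from bandwidthTest:

    #include <cstdio>
    #include <cstring>
    #include <cuda_runtime.h>

    // Select the first device whose name contains the given substring;
    // fall back to device 0 if nothing matches.
    int setDeviceByName(const char *substr) {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            if (strstr(prop.name, substr) != NULL) {
                cudaSetDevice(i);
                return i;
            }
        }
        cudaSetDevice(0);
        return 0;
    }

    int main() {
        int dev = setDeviceByName("GTX 260");  // assumed compute card; substitute your own
        printf("Computing on device %d\n", dev);
        return 0;
    }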

A cheaper route than buying an x1 PCIe graphics card (ca. $100) would be to buy an x1 PCIe network card (ca. $20); that would free up a PCI slot (assuming one is currently taken by a network card) for a $20-30 graphics card. But it is a tangled web of buying, shuffling cards, and reconfiguring. It’s exhausting… :)