Dedicated CUDA card

Hi everyone,

I have been using CUDA on a dedicated workstation (that I don’t administer) for my research until now, but I’m in the process of sorting out my options for continuing my work in a new environment. I’m not very hardware savvy at all.

In particular, do most/all workstations support having two graphics cards installed? I would like a dedicated CUDA card for long jobs as well as a standard graphics card (it doesn’t need to be any good) for rendering. Is there anything to be careful of? I’m expecting to work in a Linux environment, if that matters.

Also, I’ve heard that Quadro cards are better suited to long jobs than GTX cards, but that GTX cards give more speed for the money at the cost of some reliability. Is this still the common wisdom?

Many thanks,
Anthony

The dual-card setup you mentioned is certainly fine. The CUDA Toolkit’s Linux Release Notes should have most of the information you need.

GeForce cards are not well suited to heavy double-precision workloads, since their double-precision throughput is capped well below that of the Quadro/Tesla parts. Their single-precision FLOPS are excellent, though.

Thanks!

I was just reading something about PCIe slots, and it seems like most motherboards have only one x16 PCIe slot. When one has two cards, does this affect the rate of data transfer between the GPGPU and main memory? Also (showing how little I know about hardware), is it always possible to connect two cards even when there’s only one x16 slot?

I looked at the release notes, and they seem to indicate that in a dual graphics card system, the display graphics card needs to be an NVIDIA card as well - is this true?

Most low-end motherboards have a single x16 PCIe slot, but they usually also have at least one x1 or x8 slot. Yes, if you install two x16 cards into two x16 slots, the bandwidth is usually split between the two slots (i.e. an effective x8 to each card). Check the manufacturer’s specs very carefully if you need full x16 bandwidth to each card; most motherboard manuals describe how the PCIe lanes are distributed among the available slots.

I run a GTX 460 OC card as a compute-only card plus a GeForce 210 card for graphics … I’m probably the only person on the planet who does this. CUDA kernels run without the watchdog time limit on my GTX 460 (since it isn’t driving a display), and the rendering is done on my GeForce 210. I’ve been doing this for quite some time without any problems.
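If you go with a setup like this, you can also keep CUDA programs off the display card entirely via the CUDA_VISIBLE_DEVICES environment variable. A minimal sketch, assuming the compute card enumerates as device 0 (the ordering varies, so check with the deviceQuery sample first):

```shell
# Hypothetical ~/.bashrc fragment: make only the compute card visible
# to CUDA applications. Device 0 is an assumption; verify the index
# with the deviceQuery sample from the CUDA SDK.
export CUDA_VISIBLE_DEVICES=0
```

With this set, code that simply uses the default device (or calls cudaSetDevice(0)) will land on the compute card without any changes.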

I probably wouldn’t mix-n-match graphics card vendors due to potential driver problems.

dpe