building the best CUDA machine what hardware should be used?

triden7 · March 7, 2007, 8:07pm

I’m building a highly scalable computing program using CUDA, and I think it will benefit from multiple GPUs tremendously.

I plan to create a CUDA context on each GPU. However, as I understand it, my primary display device shouldn’t be used for GPU computing.

Are 3 CUDA GPUs the best I can expect with today’s technology? The best motherboard I can find is here:
ASUS L1N64-SLI WS Dual L(1207FX) SSI CEB AMD Motherboard (Newegg.com): http://www.newegg.com/Product/Product.asp?..N82E16813131146

However, it only has 4 PCI Express slots, one of which will presumably be needed for the primary adapter (leaving only 3 for CUDA G80s).

Generally, the application I am building benefits most from more stream processors, so in this case I am anticipating I’ll have 128 x 3 = 384 stream processors to work with if we get the CUDA context management working as we want it to.

Is 384 the maximum number of stream processors I can expect to achieve with today’s technology?

seibert · March 7, 2007, 10:27pm

The motherboard you link to will not accept three 8800 cards along with an additional graphics card to be the primary video adapter. All of the 8800 cards I have seen include a very large and heavy heatsink/fan cooling system on top of the GPU and memory. It takes up two slots, with the second slot used as a direct vent outside the case. Installing three cards will cover up every other available slot on that motherboard.
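To check a given board before buying, the way a dual-slot cooler blocks neighboring slots can be sketched in a few lines of Python. This is only an illustration: the six-position layout below is hypothetical (it is not the actual slot spacing of the board linked above), and it assumes each cooler extends over the next bracket position.

```python
def free_slots_after_install(slot_positions, card_slots, card_width=2):
    """Return the slot positions still usable after installing cards.

    slot_positions: bracket positions that have a slot (hypothetical layout).
    card_slots: positions where a card is plugged in.
    card_width: bracket positions each card occupies (2 for a dual-slot cooler).
    Assumes the cooler extends toward higher position numbers.
    """
    covered = set()
    for pos in card_slots:
        covered.update(range(pos, pos + card_width))
    return [p for p in slot_positions if p not in covered]


if __name__ == "__main__":
    layout = list(range(6))  # hypothetical: a slot at every bracket position
    print(free_slots_after_install(layout, [0, 2, 4]))  # three dual-slot cards
    print(free_slots_after_install(layout, [0, 2]))     # two dual-slot cards
```

With a real board, replace `layout` with the positions taken from the manual and see whether anything is left over for a single-slot display card.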

That said, if you are planning to use Linux (which is probably a good idea with such an unusual video configuration), you could easily work with the system remotely via ssh. A loss of video interactivity on a headless system won’t matter to you. I think it was reported in another thread that you can use CUDA on Linux without starting X servers for every card. You just need to ensure the kernel drivers are loaded. (I haven’t tested this personally, so you might want to confirm that.)

Your next limiting factor is going to be power for the cards. I think the GTX is rated at peak power usage of 185W, so you’re going to want a power supply that can at least do 3x that. Of course, it will be more subtle than that, since what matters is the amount of power supplied on the rails that power the cards, and not the total power.
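As a rough sanity check on that budget, here is a minimal Python sketch. The 185W per-card figure is the estimate above; `rest_of_system_watts` and the 20% headroom are made-up inputs for illustration, and the check ignores how the power is split across rails, so treat a passing result as necessary rather than sufficient.

```python
GPU_PEAK_WATTS = 185  # per 8800 GTX, the estimate quoted in this post


def gpu_power_budget(num_gpus, gpu_peak_watts=GPU_PEAK_WATTS):
    """Peak power drawn by the GPUs alone, in watts."""
    return num_gpus * gpu_peak_watts


def psu_is_sufficient(psu_watts, num_gpus, rest_of_system_watts, headroom=0.2):
    """True if the PSU rating covers GPUs plus everything else, with headroom.

    rest_of_system_watts and headroom are hypothetical inputs.
    Per-rail limits are NOT modeled here.
    """
    needed = (gpu_power_budget(num_gpus) + rest_of_system_watts) * (1 + headroom)
    return psu_watts >= needed


if __name__ == "__main__":
    print(gpu_power_budget(3))              # watts for three cards
    print(psu_is_sufficient(1000, 3, 200))  # hypothetical 1000W supply
    print(psu_is_sufficient(850, 3, 200))   # hypothetical 850W supply
```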

I think running 3 cards in one system is going to be a bit of a challenge, but you might get it to work. (If so, let us know what parts you used!) I’m almost certain 4 cards would need some custom hardware.

(Edited to fix power consumption of GTX.)

tachyon_john · March 8, 2007, 12:10am

We’ve had success building two systems based on the Asus P5N32-E SLI motherboard.
They have Intel Core 2 Quad CPUs, 8GB memory, and 3 GeForce 8800GTX cards each.
The hard part of building one of these has been getting our hands on high capacity PSUs.
With all three GPUs and all four CPU cores running flat out, our test system uses 700 watts, as measured by a Kill A Watt meter that the machine is plugged into. We’re running Linux and I’ve had no trouble running code on all 3 GPUs at once, regardless of whether X was running. I think that with X running you have to keep each kernel invocation’s GPU runtime below 5 seconds, but all of my current test kernels run in 5 seconds or less (per invocation) now, so that hasn’t been a problem.
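One way to stay under a per-invocation time limit like that is to split the work into several smaller launches. The Python sketch below only plans the chunk sizes; it does not launch anything. The 5 second limit is the figure mentioned above, while `items_per_second` (a throughput you would have to measure yourself) and the 50% safety margin are assumptions for illustration.

```python
import math


def plan_launches(total_items, items_per_second, max_seconds=5.0, safety=0.5):
    """Split total_items into per-launch chunk sizes.

    Each chunk is sized so that, at the estimated throughput, one launch
    should take about max_seconds * safety. Returns a list of chunk sizes
    that sums to total_items.
    """
    per_launch = max(1, int(items_per_second * max_seconds * safety))
    launches = math.ceil(total_items / per_launch)
    return [min(per_launch, total_items - i * per_launch) for i in range(launches)]


if __name__ == "__main__":
    # Hypothetical workload: 10 million items at 1 million items per second.
    for size in plan_launches(10_000_000, 1_000_000):
        print(size)
```

Each chunk size would then become the element count for one kernel invocation, with the host looping over the list.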

John Stone

triden7 · March 12, 2007, 6:31pm

Are there any adapters for PCI Express x16? Something to maybe extend the slot out a bit so I can get the 3 cards installed.

Our development environment is XP, so switching to Linux to get the display-less setup going isn’t a very attractive option. Ideally we would have 3 non-primary displays in XP, on a motherboard with 4 PCI Express x16 slots.

triden7 · March 12, 2007, 6:56pm

I am still leaning towards 4 PCI Express slots. To be honest, switching to a Linux development environment is unattractive because of our current Windows-based toolset.

I am looking at this here:
http://www.adexelec.com/pciexp.htm

Specifically, the “PE-FLEX16” flexible extender (to help get 3 G80s in one machine). Can anyone please comment as to whether this could have an impact on the reliability of the computations?

seibert · March 12, 2007, 7:27pm

This motherboard appears to have the right slot layout to fit 3 GeForce 8800 cards, and one single slot video card at the end to be the primary display:

http://www.gigabyte.com.tw/Products/Mother…%20Quad%20Royal

Note that only the blue slots are x16. The black slots (even the full length ones) are x1, so one of your 3 CUDA cards would be handicapped in data transfers to and from the CPU. Another potential problem is that the full length cards would possibly hit the RAM slots (rightmost card in the photo), or cover up the IDE and/or SATA connectors.
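To put a rough number on that handicap, here is a small Python sketch of best-case transfer times. It assumes the slots are PCI Express 1.x, where the theoretical rate is 250 MB/s per lane in each direction; real transfers will be slower. The 768 MB payload is just an illustrative size (it matches the memory on an 8800 GTX).

```python
# Theoretical PCI Express 1.x throughput per lane, per direction.
# Assumption: the board's slots are PCIe 1.x; measured rates will be lower.
MB_PER_SEC_PER_LANE = 250


def transfer_seconds(megabytes, lanes):
    """Best-case time to move `megabytes` over a link with `lanes` lanes."""
    return megabytes / (MB_PER_SEC_PER_LANE * lanes)


if __name__ == "__main__":
    payload_mb = 768  # illustrative payload size
    for lanes in (16, 1):
        print(f"x{lanes}: {transfer_seconds(payload_mb, lanes):.3f} s")
```

Whether the 16x difference matters depends on how often the application moves data between host and card compared with how long it computes on it.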
