What hardware to get?

jimpadimpa · August 9, 2008, 1:17am

I’m planning to buy a Tesla C1060 and put together a stand-alone computer for the sole purpose of running computations. I figure I’ll need the C1060 for its 4 GB of memory, as my computations use a lot of memory (I currently run them on CPUs). I’m just starting out with CUDA, so I’ve got a lot to learn.

I’m curious: what CPU (speed, number of cores), RAM (speed/size) and motherboard (speed, components) do I need to get optimal computational performance out of the Tesla/GPU? Does the answer change if I need to do a lot of round trips to and from the GPU versus only rarely? If I’m able to run all computations on the GPU, does it even matter what CPU and RAM I use as long as I have a PCIe slot? I want the best performance without any “overkill” hardware.

MisterAnderson42 · August 9, 2008, 12:05pm

If you run everything on the GPU, the CPU matters not at all. I haven’t had a chance to play with the new G200 chips yet, but my 8800 GTX system is running on a ~4 year old CPU and I don’t notice a bit of performance difference compared to other systems with modern CPUs.

Well, I shouldn’t say not at all: because if you are running a multi-GPU system you need one CPU core per GPU, as CUDA spin-waits for tasks to finish on the GPU.

If you need the best CPU->GPU memory copy performance, be careful with the Mobo chipset you get. You can find benchmarks scattered around the forums for some, but the PCIe v2 Intel chipsets have consistently outperformed the PCIe v2 NVIDIA chipsets (to the tune of 6 GiB/s vs 4 GiB/s). If you don’t care too much about the memory copy performance, then any Mobo with a PCIe 16x slot will do.

jimpadimpa · August 9, 2008, 1:16pm

Thanks for the response. Do you know what RAM was used to get 6 GB/s? I wonder if using faster DDR3-memory improves PCIe performance…

Tom’s hardware RAM speed test

MisterAnderson42 · August 9, 2008, 5:57pm

I know some of the benchmarks reaching that level were with DDR2. I’m not sure if there were any with DDR3. PCIe gen2 x16 only provides 8 GiB/s theoretical. Mark Harris says (in a post on the NVIDIA forums) that there is 15% overhead for PCI-e transfers, so the max you can expect is 6.8 GiB/s, assuming that the chipset is capable of delivering that in the first place (something it appears the 780i’s NF200 bridge limits).
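
The ceiling quoted above is plain arithmetic, and a minimal Python sketch makes it explicit. The function name and its parameters are hypothetical, chosen only for illustration; the 8 GiB/s and 15% figures are the ones given in this post.

```python
def pcie_effective_ceiling(theoretical_gib_s, overhead_fraction):
    """Return the best-case transfer rate once protocol overhead is subtracted.

    Both arguments are illustrative inputs, not values read from hardware.
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return theoretical_gib_s * (1.0 - overhead_fraction)


# Figures from the post: PCIe gen2 x16 at 8 GiB/s theoretical, 15% overhead.
ceiling = pcie_effective_ceiling(8.0, 0.15)
print(f"Expected ceiling: {ceiling:.1f} GiB/s")
```

This only models the bus itself. Whether a given chipset can actually reach that rate is a separate question, as noted above.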

jimpadimpa · August 10, 2008, 1:43am

So, DDR2 800 and DDR3 800 (max 6.4 GB/s) are not optimal, but anything faster should be enough. I don’t know if transfer from RAM to GPU depends on things such as FSB speed. “Stumbled” upon this:

RAM/FSB

Slowly starting to see what kind of hardware I need…
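
For reference, the 6.4 GB/s figure for DDR2-800 and DDR3-800 follows from 800 million transfers per second over a 64-bit (8-byte) memory bus. A small Python sketch of that calculation is below. The function name is hypothetical, and the dual-channel line is an assumption about how a board might be configured, not something stated in this thread.

```python
def ddr_peak_gb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak memory bandwidth in GB/s (10^9 bytes per second)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# DDR2-800 / DDR3-800: 800 MT/s on a single 64-bit channel.
single_channel = ddr_peak_gb_s(800)

# Assumption: the same modules running in a dual-channel configuration.
dual_channel = ddr_peak_gb_s(800, channels=2)

print(f"single channel: {single_channel:.1f} GB/s")
print(f"dual channel:   {dual_channel:.1f} GB/s")
```

These are theoretical peaks, so sustained copy rates will be lower in practice.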

tmurray · August 10, 2008, 5:51am

I get 4.6GB/s HtoD and DtoH on an X38 motherboard with DDR2 and an 8800 GT. At work, I get 6GB/s on whatever Xeon motherboard is in my machine (with FBDIMMs and all that) to a C1060.

seibert · August 10, 2008, 2:40pm

I just built a GTX 280 workstation today, and got 6.3 GB/sec DtoH, 5.3 GB/sec HtoD pinned memory bandwidth. This is with a 2.6 GHz quad-core Phenom on the MSI K9A2 Platinum (AMD 790FX chipset) and some cheap DDR2-800 memory. So it looks like this motherboard can saturate the DDR2 memory bandwidth, at least in one direction.

(One warning about the MSI board: for some reason it defaulted to disabling PCI-E 2.0 rather than auto-detecting the card capability. I just had to flip that option in the BIOS and then I got full speed.)
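
To put the numbers reported in this thread side by side, here is a rough Python sketch that expresses each one as a fraction of the 6.8 GiB/s PCIe gen2 x16 ceiling mentioned earlier. It assumes the posters' "GB/s" and "GiB/s" can be treated as the same unit, which is a simplification, and the labels and helper name are made up for illustration.

```python
# Ceiling from earlier in the thread: 8 GiB/s theoretical minus 15% overhead.
PCIE_CEILING = 8.0 * (1.0 - 0.15)

# Rates as reported by posters in this thread (units treated loosely).
reported = [
    ("X38, DDR2, 8800 GT (HtoD and DtoH)", 4.6),
    ("Xeon, FB-DIMM, C1060 (direction not stated)", 6.0),
    ("790FX, DDR2-800, GTX 280 (DtoH, pinned)", 6.3),
    ("790FX, DDR2-800, GTX 280 (HtoD, pinned)", 5.3),
]


def fraction_of_ceiling(measured, ceiling=PCIE_CEILING):
    """Return measured bandwidth as a fraction of the assumed ceiling."""
    return measured / ceiling


for label, rate in reported:
    print(f"{label}: {rate:.1f} -> {fraction_of_ceiling(rate):.0%} of ceiling")
```

None of this replaces running a bandwidth test on the actual machine; it only shows how far each reported result sits from the theoretical limit.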
