GPU clusters for HPC

Hello,
I want to work on GPU clusters for HPC. Which type of GPU should I use to build a cluster?
Where should I start? Please help…

NVIDIA provides Tesla solutions in 1U boxes with interconnect cables and host cards: http://origin-www.nvidia.com/object/tesla_computing_solutions.html
You can build your own (simple) cluster using PCs with GeForce/Quadro/Tesla cards on motherboards that have multiple PCIe slots. Tesla and Quadro solutions are more expensive, but they offer advantages such as large amounts of ECC-protected memory and full-speed double-precision computation (GeForce cards have reduced double-precision throughput).
The next step is software. You will probably prefer a Unix-like operating system (Linux, for example). I'm not familiar with cluster building (yet), so I can only advise you to read this Google Answers thread, which provides useful links: http://answers.google.com/answers/threadview/id/759126.html

For an HPC cluster, Tesla GPUs with ECC support and full double precision are the right choice.

All the major OEMs and several smaller companies offer 1U/2U servers with Tesla inside and even preconfigured clusters.

Thanks, all, for the replies. What I actually want to know is how to run two GPUs in parallel, with the host CPU dividing the workload between them depending on what the application requires. I want to work in a Linux environment. Which software do I need? I understand that CUDA is used to drive a single GPU, but what do I need to do to make the GPUs work in parallel? Where should I start after purchasing the graphics cards?

The only way to make CUDA GPUs work together efficiently is to distribute the load between them manually. This will be a task-specific approach. The CUDA Programming Guide can help: it has a section on using multiple devices. SLI technology is for graphics only, not for GPGPU.
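To make "distribute the load manually" a bit more concrete, here is a minimal host-side sketch in plain C++ of the simplest case: splitting N independent work items as evenly as possible across the available GPUs. The names Chunk and split_work are made up for this illustration and are not part of CUDA. In a real program you would get the device count from cudaGetDeviceCount(), then for each chunk call cudaSetDevice(d) and launch your kernel on that chunk's range. An even split assumes identical cards and uniform work per item; other workloads will need a different scheme.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: half-open range [begin, end) of work items
// assigned to one device.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: split `total` work items as evenly as possible
// across `devices` GPUs. The first (total % devices) chunks receive one
// extra item, so chunk sizes differ by at most one.
std::vector<Chunk> split_work(std::size_t total, std::size_t devices) {
    std::vector<Chunk> chunks;
    if (devices == 0) {
        return chunks;  // no devices: nothing to assign
    }
    chunks.reserve(devices);

    const std::size_t base = total / devices;   // minimum items per device
    const std::size_t extra = total % devices;  // leftover items to spread

    std::size_t begin = 0;
    for (std::size_t d = 0; d < devices; ++d) {
        const std::size_t size = base + (d < extra ? 1 : 0);
        chunks.push_back({begin, begin + size});
        begin += size;
    }
    return chunks;
}
```

Chunk d then describes the slice of the input that device d is responsible for: the host copies that slice to the device, runs the kernel on it, and copies the results back.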

Software you'll need: the NVIDIA driver, the CUDA Toolkit, and (optional, but needed for learning the concepts) the GPU Computing SDK code samples. The links are listed on the CUDA download page: http://developer.nvidia.com/cuda-toolkit-40
First, choose an operating system supported by the NVIDIA SDK and install it. The next step is driver installation: you can use the developer driver (DevDriver) from the CUDA download page (it's not required, but it guarantees compatibility with the toolkit).

If you prefer Linux, I can advise something RPM-based, like Fedora (it's user-friendly). A year ago I had problems installing the driver on Ubuntu.

Whatever GPU you use, do not use an operating system that:

  1. Occupies more than 1/2 of system RAM just to sustain itself
  2. Is busier drawing rectangles and squares on screen than listening to what its user wants
  3. Stands like a 500-pound gorilla between you and GPU performance.

Hi everyone,

Sorry, I'm a newbie, but this thread seems to imply that it is possible to cluster GeForce cards for some HPC gain? My CUDA application works fine in single precision on my GeForce GTX 460M, and I'm pretty sure most of my processing bottleneck is just a need for more threads (it's a relatively low-memory program). So I'm thinking that, theoretically, I could throw something like 10 GeForce cards at my program and see some nice speedup? This assumes I can find a suitable chassis, enough CPU cores, PCIe slots, etc. I was just wondering whether there is any inherent property of the GeForce series that limits how many GeForce cards I can cluster. Thank you!