GTX 590 for GPU. any comments?

xnov · November 12, 2011, 8:38am

I’m planning to buy a GPU card for scientific computing.
I’m kind of new to GPUs, so I don’t really know which hardware is best. My budget is up to 1000 USD.
After browsing the internet, I found the GTX 590 to be the best candidate that fits my budget.
Does anyone have comments about the GTX 590? Any other suggestions?

FYI, I’m going to use it for n-body simulation and computational fluid dynamics (CFD).

alex

pasoleatis · November 12, 2011, 8:52am

The GTX 590 is made up of two GPUs. Maybe someone can explain how computations are handled on dual-GPU cards. I wonder whether the memory is shared, or whether transfers are needed between threads running at the same time on each GPU?

One thing: the GTX 590 does not support CUDA compute capability 2.1, only 2.0.

tera · November 12, 2011, 10:59am

The two devices on the GTX 590 are completely separate. You have to manually adapt the program code to split the work between them and transfer data from one to the other as needed.
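As a rough illustration of the bookkeeping this implies, here is a minimal Python sketch that splits a list of bodies into one contiguous index range per device. The helper name `split_work` and the problem size are hypothetical; selecting each device, launching kernels, and copying data between the two GPUs through the CUDA runtime are not shown.

```python
def split_work(n_items, n_devices):
    """Return one half-open (start, stop) index range per device.

    Hypothetical helper: ranges are contiguous, cover 0..n_items exactly,
    and differ in size by at most one item.
    """
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    base, extra = divmod(n_items, n_devices)
    ranges = []
    start = 0
    for dev in range(n_devices):
        # The first `extra` devices take one additional item each.
        size = base + (1 if dev < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Assumed example size: 100001 bodies over the two devices of a GTX 590.
    for dev, (start, stop) in enumerate(split_work(100001, 2)):
        print(f"device {dev}: bodies [{start}, {stop})")
```

In a direct-summation n-body code, each device would compute forces only for its own range but would still need the positions of all bodies, so the updated positions have to be exchanged between the devices every time step.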

For CUDA, compute capability 2.0 is actually better than 2.1. The two are fully binary compatible, but (depending on the code) CC 2.0 devices are up to 50% faster per clock and core than CC 2.1 devices.

xnov · November 12, 2011, 11:34am

So do you think CC 2.0 is enough? Is it better to buy one GTX 590 with CC 2.0 than two GTX 560 Ti cards with CC 2.1?

tera · November 12, 2011, 12:57pm

Yes, you should prefer CC 2.0.
I’d also go with the GTX 590, as it only needs one PCIe slot, but that would depend a lot on your current system configuration and your future intentions.

pasoleatis · November 12, 2011, 3:31pm

How is it possible that CC 2.1 is slower than 2.0? Why did they change the architecture? Is it better for gaming?

seibert · November 13, 2011, 2:24am

CC 2.1 has 48 CUDA cores per multiprocessor, grouped into 3 sets of 16, much like CC 2.0 has 32 CUDA cores per MP grouped in 2 sets of 16. However, CC 2.1 can still only issue instructions from 2 warps at a time, so the third set of 16 CUDA cores will only be active if 2 instructions can be issued from the same warp. The co-issued instructions from the same warp have to be independent of each other (i.e. read and write different registers), but an independent instruction won’t always be available. As a result, the third set of 16 CUDA cores will sometimes go idle.

Given equal clock rates, CC 2.1 is the same as or faster than (depending on instruction sequence) CC 2.0 per multiprocessor, because the extra CUDA cores in CC 2.1 are sometimes helpful. However, CC 2.1 is the same as or slower than CC 2.0 per CUDA core, because sometimes the extra CUDA cores in CC 2.1 go idle. Since we usually compare CUDA devices based on clock rate and number of CUDA cores, people often say that CC 2.1 is slower than 2.0.

(Exception to the above: CC 2.1 has twice as many special function units (8) per multiprocessor as CC 2.0. For code that spends a lot of time computing special functions, CC 2.1 could be faster regardless of utilization of the third set of 16 CUDA cores.)
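The per-multiprocessor versus per-core comparison can be put into numbers with a toy model. This is only a sketch under a strong assumption: two sets of 16 CUDA cores are always busy, and the third set on CC 2.1 is busy for a fraction `dual_issue_fraction` of clocks (a made-up parameter standing in for how often an independent second instruction is available). It ignores special function units, memory, and everything else.

```python
def lanes_per_clock(cores_per_mp, dual_issue_fraction=0.0):
    """Toy estimate of busy CUDA cores per clock on one multiprocessor.

    Assumption: two warps issue every clock and keep 2 sets of 16 cores
    busy; any remaining cores are busy only for the given fraction of
    clocks in which a second, independent instruction can be co-issued.
    """
    if not 0.0 <= dual_issue_fraction <= 1.0:
        raise ValueError("dual_issue_fraction must be between 0 and 1")
    always_busy = 32
    extra_cores = cores_per_mp - always_busy
    return always_busy + extra_cores * dual_issue_fraction


if __name__ == "__main__":
    cc20 = lanes_per_clock(32)  # CC 2.0: 32 cores per multiprocessor
    for p in (0.0, 0.5, 1.0):
        cc21 = lanes_per_clock(48, p)  # CC 2.1: 48 cores per multiprocessor
        per_mp = cc21 / cc20
        per_core = (cc21 / 48) / (cc20 / 32)
        print(f"p={p:.1f}: CC 2.1 vs CC 2.0 per MP = {per_mp:.2f}, "
              f"per core = {per_core:.2f}")
```

In this model, when the third set never gets work, a CC 2.1 core does two thirds of the work of a CC 2.0 core at the same clock, which is the same thing as saying CC 2.0 is up to 50% faster per clock and core. When the third set is always busy, the two are equal per core and CC 2.1 is 1.5 times faster per multiprocessor.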

pasoleatis · November 13, 2011, 7:44am

Wow, thanks. I wonder if there are FFT comparisons between different cards.

seibert · November 13, 2011, 10:31pm

The other reason CC 2.1 is considered slower than CC 2.0 is just the particular GPU configurations NVIDIA has decided to market. The slowest CC 2.0 desktop GPU I can find has 11 multiprocessors (GTX 465) and the fastest CC 2.1 device (GTX 560 Ti, but not the OEM version) has 8 multiprocessors. Due to clock rate differences, the GTX 560 Ti almost certainly beats the GTX 465, and has the possibility to beat the GTX 470 for some floating point instruction sequences. Otherwise, the two populations have no performance overlap.

Certainly, restricting to just the GTX 500 series, all existing desktop CC 2.0 devices are faster than all CC 2.1 devices, regardless of the relative merits of the different multiprocessor capabilities. When extrapolating performance from CC 2.0 to CC 2.1, use the ratio of [clock rate] * [# of multiprocessors] as the scale factor. Then, if your code makes use of the extra special function units or CUDA cores per MP on CC 2.1, you will be pleasantly surprised.
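The scale factor described above is simple enough to sketch in a few lines of Python. The multiprocessor counts for the GTX 465 and GTX 560 Ti come from this thread; the GTX 470 count and all shader clock values below are assumptions filled in for illustration and should be checked against the published specifications before relying on them.

```python
def scale_factor(clock_mhz, multiprocessors):
    """[clock rate] * [# of multiprocessors], in arbitrary units."""
    return clock_mhz * multiprocessors


def relative_speed(card, baseline):
    """Ratio of scale factors; each argument is (clock_mhz, multiprocessors)."""
    return scale_factor(*card) / scale_factor(*baseline)


# (shader clock in MHz, multiprocessors). Clocks are ASSUMED approximate
# reference values, not measurements.
cards = {
    "GTX 465 (CC 2.0)": (1215, 11),
    "GTX 470 (CC 2.0)": (1215, 14),
    "GTX 560 Ti (CC 2.1)": (1645, 8),
}

if __name__ == "__main__":
    baseline = cards["GTX 465 (CC 2.0)"]
    for name, spec in cards.items():
        print(f"{name}: {relative_speed(spec, baseline):.2f}x the GTX 465")
```

This gives only the conservative baseline for a CC 2.1 card. With the assumed clocks, the GTX 560 Ti lands close to the GTX 465, and any use of its third set of CUDA cores or extra special function units would push it above that estimate.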
