More GDDR2 vs. Less GDDR3: Performance and Usability in CUDA

Hello everyone! :D

I am a beginner in CUDA, but I am very interested in learning about parallel computing and GPGPU stuff.

For that reason, I’m going to buy an 8600GT so I can start working with it, but I’m not sure how important memory bandwidth and size are in certain applications. I expect to run Monte Carlo simulations as well as ALU-intensive tasks. No 3D modeling or video rendering will be done on this card.

I may get a 256MB GDDR3 card clocked at 1.6GHz (GPU @ 600MHz) or a 512MB GDDR2 card running at 800MHz (GPU @ 540MHz which, btw, can probably be overclocked to 600MHz easily, as it comes with active cooling). The first one is a little more expensive.

If it matters, I do care about how fast a video card is, of course. But I care even more about being able to “explore” the technology without being limited by insurmountable boundaries, like lack of memory.

If I need more speed, I can push it to the limit by tweaking or overclocking the GPU, memory and shaders. But if I need more memory, there will be nothing I can do. (Would CUDA share system memory if needed, like TurboCache? I know it would make the application run much slower, but it might help break through that “insurmountable boundary” if needed…) Every time I read a CUDA-related topic, I see the term “bandwidth”, and that’s what makes me afraid of getting a GDDR2 video card.

So, what do you guys think? Should I go for the fastest or the “wider” one?

Thank you so much!

CUDA performance is usually limited by the number of stream processors. The 8600 cards have 32 of them. If you can find an 8800 card, those have 96-128. This matters a lot more than the memory bandwidth.

Anything less than an 8800GT is a waste of money IMO. Compared to an 8600, the extra $ are always worth it.

Txs for your reply, kristleifur!

So, can I assume that the GPU will bottleneck overall performance before a task uses all of the high-speed 256MB?

Yeah, I was actually thinking of getting an 8800GS, but it is about twice as expensive… I really can’t afford it right now.

Sorry, but what would you think if, instead of getting a better card, I spent less than half of the 8600GT price on an 8400GS? Just to learn about CUDA compiling and parallel programming and to do some light maths; then next year I’d get an 8800, 9500 or 9600GT, which will be cheaper, I hope. Wouldn’t a G98 at 450MHz with GDDR2 at 800MHz and 16 SPs do the trick for me for a while?

Also, I’ve found a review that compares two 8600GT cards, one with GDDR2 and one with GDDR3. In this case the memory clock difference is just 400MHz, but I can’t see any major performance advantage for the faster one… I expected it to reach 25% in medium-resolution games where 256MB is plenty. Should I expect the same small difference in CUDA?

Any hint or suggestion is very welcome, fellows. Txs! B)

You can learn how to compile CUDA apps without a GPU; this is one of the benefits of emulation.

People mention memory bandwidth because it is quite frequently the limiting factor when using the cards rather than the computational rate. I would guess that if you use nearly all of the memory of the card you’ll be limited by memory bandwidth during computation.

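Comparing the two 8600GT cards at the site you referenced, both cards have nearly the same peak computational rate of 75.5 GFLOP/s (at 1 MADD/cycle). Where they differ is memory bandwidth. Memory bandwidth of the card is calculated by taking the effective memory data rate in GHz (1.4 for the GDDR3 card) and multiplying by the number of bytes the memory bus is wide (128/8 for both cards). The effective rate already includes the factor of 2 for double data rate, and that factor is the same for GDDR2 and GDDR3; the “3” is a generation number, not a multiplier. That gives 22.4 GB/s for the GDDR3 card. If the GDDR2 card’s memory runs 400MHz slower, as the review says, it comes out at about 16 GB/s, so the GDDR3 card has roughly 40% more bandwidth.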
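For anyone who wants to redo this arithmetic for another card, here is a minimal Python sketch. It rests on a few assumptions that are not from the review itself: the quoted memory clocks are treated as effective data rates (GDDR3, like GDDR2, transfers twice per clock, and that factor is already included), the GDDR2 card is taken to be 400MHz slower at 1.0 GHz, and the 1.18 GHz shader clock is simply the value that reproduces roughly 75.5 GFLOP/s with 32 stream processors. The function names are made up for illustration.

```python
def memory_bandwidth_gb_s(effective_rate_ghz, bus_width_bits):
    """Peak memory bandwidth in GB/s (10^9 bytes per second).

    effective_rate_ghz is the effective data rate, i.e. it already
    includes the double-data-rate factor of 2.
    """
    return effective_rate_ghz * bus_width_bits / 8


def peak_gflops(stream_processors, shader_clock_ghz, flops_per_cycle=2):
    """Peak compute rate in GFLOP/s; one MADD counts as 2 operations."""
    return stream_processors * shader_clock_ghz * flops_per_cycle


# Assumed figures, see the notes above.
gddr3 = memory_bandwidth_gb_s(1.4, 128)
gddr2 = memory_bandwidth_gb_s(1.0, 128)
compute = peak_gflops(32, 1.18)

print(f"GDDR3: {gddr3:.1f} GB/s, GDDR2: {gddr2:.1f} GB/s, "
      f"peak: {compute:.1f} GFLOP/s")
```

Plug in the clocks from the spec sheet of whichever card you are considering; the point is only that the bus width and the effective data rate are the two numbers that matter for peak bandwidth.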

As for how much memory you need, it really depends on the problem you are solving. While you can’t do a ‘turbo cache’ you can usually break a parallel problem into smaller parts and send each part to the GPU independently.
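As a rough illustration of breaking a problem into parts, here is a small Python sketch of a Monte Carlo estimate of pi that only ever holds one batch of samples in memory. It is a CPU-only stand-in, not CUDA code: on a GPU each batch would correspond to one kernel launch (plus any host-to-device copy), with only the per-batch count copied back. The function name and batch sizes are hypothetical.

```python
import random


def estimate_pi_in_batches(total_samples, batch_size, seed=0):
    """Monte Carlo estimate of pi, holding one batch in memory at a time."""
    rng = random.Random(seed)
    inside = 0
    done = 0
    while done < total_samples:
        n = min(batch_size, total_samples - done)
        # One "part" of the problem: this is the only large allocation,
        # and its size is bounded by batch_size, not by total_samples.
        batch = [(rng.random(), rng.random()) for _ in range(n)]
        # Reduce the batch to a single number before moving on.
        inside += sum(1 for x, y in batch if x * x + y * y <= 1.0)
        done += n
    return 4.0 * inside / total_samples


if __name__ == "__main__":
    print(estimate_pi_in_batches(200_000, 10_000))
```

The batch size only changes peak memory use, not the result, which is why this kind of workload is rarely limited by the amount of memory on the card.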

I’d go with the 256MB DDR3. As aakova said, if you’re using so much memory that 256MB is not enough, you’re going to be bandwidth bound anyway, so DDR2 will clamp your performance. In most cases you’ll be able to split your computations into smaller chunks anyway.

Also, if you have that much data (over 256MB) you’ll want a stronger card to crunch through it quickly. An 8600 will take ages to eat through 500MB of data, especially if the lower bandwidth slows down the throughput further. (Ages compared to an 8800 or 9800; it can still be much faster than a CPU.)

The 8600GT is not a bad card at all, and I think it’s a good way to get into CUDA. I have one. However, where I live an 8800GTS (the G92-based one with 512MB) costs only about twice as much and has four times the power, so it definitely has a better cost/performance ratio.
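To put rough numbers on “bandwidth bound”, here is a back-of-the-envelope Python sketch. It compares the time to read the data once from card memory with the time to do the arithmetic, and reports which one dominates. The 75.5 GFLOP/s figure is the one quoted earlier in the thread; the 22.4 GB/s and 16 GB/s bandwidths are assumptions (1.4 and 1.0 GHz effective data rate on a 128-bit bus), and the model ignores PCIe transfers, latency and caching entirely. The function names are invented for this example.

```python
def limiting_factor(flops_per_byte, peak_gflops, bandwidth_gb_s):
    """Return which resource bounds a kernel doing flops_per_byte of work."""
    # Below this ratio the memory system cannot feed the ALUs fast enough.
    break_even = peak_gflops / bandwidth_gb_s
    return "memory bandwidth" if flops_per_byte < break_even else "compute"


def min_pass_time_ms(data_bytes, flops_per_byte, peak_gflops, bandwidth_gb_s):
    """Lower bound on the time for one pass over the data, in milliseconds."""
    read_s = data_bytes / (bandwidth_gb_s * 1e9)
    compute_s = data_bytes * flops_per_byte / (peak_gflops * 1e9)
    return 1e3 * max(read_s, compute_s)


if __name__ == "__main__":
    data = 500e6  # 500MB, as in the post above
    for name, bw in [("GDDR3", 22.4), ("GDDR2", 16.0)]:
        bound = limiting_factor(1.0, 75.5, bw)
        t = min_pass_time_ms(data, 1.0, 75.5, bw)
        print(f"{name}: bound by {bound}, at least {t:.1f} ms per pass")
```

With one floating-point operation per byte, both cards come out memory bound, so the per-pass time scales directly with bandwidth; only kernels that do several operations per byte loaded would stop caring about the memory type.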

I have just bought an EVGA 8800GT from Newegg to learn how to write CUDA apps…I’m pretty happy with it so far, and I got it for a great price. I don’t see why you’d go with the 8600, since it’s much less powerful and not much cheaper…

Sorry guys, I’m back!

That is a workaround, but seems to be just enough for me. Txs!

That’s what I was told. The GPU can’t handle pretty much anything that consumes all that memory. Also, I was thinking: if some video card makers put only 320MB on an 8800GTS, which has 3 times as many shader processors as an 8600, memory size should not be a problem in this price range. ^_^

Yeah, 112 shaders should be very nice for CUDA. The point is, first of all, it costs 250% more than an 8600GT GDDR3 here. Plus, my PSU (25A combined on the 12V rails) could not support an 8800GT alongside my heavily overclocked processor… not to mention that the card is going to spend 95% of its time idle, so the power consumption would be a big waste of money for me, although the card will run underclocked when performance is not needed. :(

I’m going to order it tomorrow, and the 8600GT GDDR3 is my choice so far, unless somebody changes my mind, of course. :)

Txs everyone for your opinions!