So now that I have my code in an almost-fully-functional condition (will it ever actually be?? – stupid pointers),
I am considering upgrading my old 8800 GTS 320MB to a GTX 275.
I am trying to justify the purchase with an estimate of how much performance my code will gain (which would make my thesis that much more appealing), so I have a few questions:
Background info: my code currently uses 32 registers per thread and around 14400 bytes of shared memory per block. My best configuration is 96 blocks of 256 threads (33% occupancy).
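For reference, the 33% figure can be reproduced with a little arithmetic. The sketch below is only a rough model. The per-MP limits are the published compute capability 1.0 limits (8800 series) and 1.3 limits (GTX 200 series); the function and dictionary names are made up for illustration; it assumes the shared-memory figure is about 14400 bytes per block; and it ignores register allocation granularity.

```python
# Per-multiprocessor limits (compute capability 1.0 and 1.3).
G80 = {"regs": 8192, "smem": 16384, "max_threads": 768, "max_blocks": 8}
GT200 = {"regs": 16384, "smem": 16384, "max_threads": 1024, "max_blocks": 8}


def blocks_per_mp(limits, threads_per_block, regs_per_thread, smem_per_block):
    """Resident blocks per MP: the tightest of the four resource limits."""
    by_regs = limits["regs"] // (regs_per_thread * threads_per_block)
    by_smem = limits["smem"] // smem_per_block
    by_threads = limits["max_threads"] // threads_per_block
    return min(by_regs, by_smem, by_threads, limits["max_blocks"])


def occupancy(limits, threads_per_block, regs_per_thread, smem_per_block):
    """Fraction of the MP's thread slots that are occupied."""
    blocks = blocks_per_mp(limits, threads_per_block, regs_per_thread, smem_per_block)
    return blocks * threads_per_block / limits["max_threads"]


# Kernel figures from the post: 256 threads, 32 registers, ~14400 bytes (assumed).
for name, limits in (("8800 (G80)", G80), ("GTX 275 (GT200)", GT200)):
    print(name, blocks_per_mp(limits, 256, 32, 14400), occupancy(limits, 256, 32, 14400))
```

Under these assumptions the 8800 fits one block per MP, since registers and shared memory each cap it at one. On the GTX 275 the doubled register file would not raise that by itself, because shared memory would still limit each MP to a single block.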
Based on scaling by clock speed, number of multiprocessors (MPs), and the increased register count, it looks like I should get around a 3x performance boost over my 8800 with the GTX 275. Can anyone confirm similar numbers?
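The scaling estimate can be written out explicitly. This is a back-of-the-envelope sketch, not a benchmark: the MP counts, shader clocks, and memory bandwidths below are reference-spec figures assumed for illustration (factory-overclocked cards will differ), and the names are hypothetical.

```python
# Reference-spec figures, assumed for illustration; check against the actual cards.
CARDS = {
    "8800 GTS 320MB": {"mps": 12, "shader_mhz": 1188, "bandwidth_gbs": 64.0},
    "GTX 275": {"mps": 30, "shader_mhz": 1404, "bandwidth_gbs": 127.0},
}


def compute_ratio(old, new):
    """Speedup if the kernel scales with (number of MPs) x (shader clock)."""
    return (new["mps"] * new["shader_mhz"]) / (old["mps"] * old["shader_mhz"])


def bandwidth_ratio(old, new):
    """Speedup if the kernel is limited purely by memory bandwidth."""
    return new["bandwidth_gbs"] / old["bandwidth_gbs"]


old, new = CARDS["8800 GTS 320MB"], CARDS["GTX 275"]
print(f"compute-bound estimate:   {compute_ratio(old, new):.2f}x")
print(f"bandwidth-bound estimate: {bandwidth_ratio(old, new):.2f}x")
```

With these numbers the compute-bound estimate comes out just under 3x, while the bandwidth-bound estimate is closer to 2x, so a bandwidth-limited kernel may land nearer the lower figure.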
Considering the 275 on its own now: how much faster should a single-precision code be than a double-precision version? My code is bandwidth limited. Given that effective bandwidth is halved (each value is twice as many bytes), and that there is only one double-precision unit per MP (versus 8 single-precision cores), does that mean switching to double will cost me a factor of 16 in performance?
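One rough way to frame the question is a roofline-style model in which compute and memory traffic overlap perfectly, so runtime is the larger of the two rather than their product. That overlap is an assumption, as are the function name and the default ratios (8x lower double-precision arithmetic rate, 2x more bytes moved).

```python
def dp_slowdown(t_compute, t_memory, dp_rate_ratio=8.0, dp_bytes_ratio=2.0):
    """Estimated slowdown when moving a kernel from single to double precision.

    t_compute, t_memory: single-precision time spent on arithmetic and on
    memory traffic (any consistent unit). Assumes the two overlap perfectly,
    so the runtime is max(compute, memory).
    """
    single = max(t_compute, t_memory)
    double = max(t_compute * dp_rate_ratio, t_memory * dp_bytes_ratio)
    return double / single


print(dp_slowdown(0.1, 1.0))  # strongly bandwidth-bound kernel
print(dp_slowdown(0.5, 1.0))  # bandwidth-bound, but arithmetic takes over in double
print(dp_slowdown(1.0, 0.1))  # strongly compute-bound kernel
```

In this model the two penalties do not multiply: the slowdown ranges from 2x (purely bandwidth-bound) to 8x (purely compute-bound), depending on how much arithmetic there is per byte moved.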
Cooling/overclocked cards: Is running an intense CUDA program more or less taxing on the card than running an intense game like Crysis? If this card is going to be used for heavy CUDA work, should I get a 275 from a brand with better cooling, or does it not matter? Also, would it be risky to get a factory-overclocked card (like the BFG)? I think these all depend on whether CUDA is more or less taxing than games (if CUDA is less taxing, then anything good for games is good for CUDA).
Once I do take the plunge, I'll let you know what improvements I see.
[Of course, this won't end up being an apples-to-apples comparison, because I am also going to drop an Intel i7-920 into the rig (I currently have a Core 2 Duo E6600), so I would expect the GPU/CPU speedup to be reduced by the i7 kicking butt. (I'm still only going to use one thread, though.)]