best CUDA-enabled card for $100 (or so)

First off, sorry if this is in the wrong section. I was about to start a thread in the hardware section until I saw all of those threads dealing with gaming and what card works best for games. I don’t play computer games. Crazy right? But I don’t.

However, I do need to program in CUDA and am looking to do a slight upgrade. Is 100 bucks gonna get me a great card? I wouldn’t think so. However, I currently have an older GeForce 8600 GT with only 32 cores.

So strictly for the purposes of CUDA programming, is there a card (less than $100) that you can recommend as a decent upgrade?

Here’s one example of a card that may fit the bill (with my limited knowledge of what is best):
http://www.tigerdirect.com/applications/searchtools/item-details.asp?EdpNo=4683600&SRCCODE=GOOGLEBASE&cm_mmc_o=VRqCjC7BBTkwCjCECjCE

Thanks in advance.

Since most new CUDA features now require Fermi, I’d consider the GTS 450 as another option:

http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100007709%20600082239&IsNodeId=1&name=GeForce%20GTS%20400%20series

It can be found for pretty close to $100. In terms of raw compute performance, it can be slower or faster than the GTX 260, depending on how well the instruction scheduler can use the extra group of 16 CUDA cores in each SM. (Compute capability 2.1 has somewhat variable performance due to the SM design.) The theoretical peak memory bandwidth of the GTS 450 is quite a bit lower (57 GB/sec instead of 112 GB/sec). However, the GTS 450 does have the L1 and L2 caches, fast atomics, floating point atomics, support for more C++ features, and all the other things mentioned in the programming guide.

So the GTX 260 is a good choice for raw speed, but lacks the Fermi features of interest. The GTS 450 allows you to play with some of the newer CUDA features but loses in memory bandwidth and sometimes in compute performance. (Unless the new features, like the cache, help your code.) Also, the GTS 450 draws a little more than half the electrical power of a GTX 260, which keeps your case cooler.
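For anyone wondering where bandwidth figures like those come from, here is a rough sketch of the usual back-of-the-envelope calculation: bus width in bytes times the effective memory transfer rate. The bus widths and transfer rates below are reference-card values recalled from spec sheets, so treat them as assumptions and check the numbers for the exact board you are buying. Python is used here only for the arithmetic; the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits, effective_rate_mt_s):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits      -- width of the memory interface in bits
    effective_rate_mt_s -- effective data rate in millions of transfers/sec
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mt_s * 1e6 / 1e9


# Assumed reference specs (bus width in bits, effective rate in MT/s).
# Factory-overclocked boards will differ.
cards = {
    "GTS 450": (128, 3608),
    "GTX 260": (448, 1998),
}

for name, (bus_bits, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(bus_bits, rate):.1f} GB/s")
```

With those assumed inputs the results land close to the 57 and 112 GB/sec quoted above. Real kernels will see less than the theoretical peak.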

WOW. What a great and detailed reply! To be honest, I’m perhaps more confused now, but good information nevertheless. I just watched a vid on Fermi and read some information on it. So it is the newer, “complete” GPU computing architecture. How does that help me and what advantages will it provide in my scenario? Who knows.

But I gather from you that the GTS 450, having the newer architecture, is a better option over the GTX 260.

The best card is, as Seibert suggested, the biggest Fermi card you can afford, just because the architecture offers significant new features and software support which make it the most desirable to develop with. But best and fastest might not be the same thing. The fastest card is usually the one with the largest memory bandwidth. I would guess that a GTX260 for $100 will be faster than a GTS450 running a lot of existing code. But I wouldn’t be buying a GT200 for new code development when there is a Fermi available for the same money, even if the Fermi winds up being a bit slower.

It depends on what your scenario is. Are you looking to apply CUDA to a new problem? Do you have existing CUDA code that you would like to run faster? What kind of programming are you doing?

I would call it a qualified “better.” If my goal was to run one of my older CUDA applications as fast as possible for $100, I’d go with the GTX 260 you found. If my goal was to write new code that would use some of the Fermi features to be simpler or faster, I would get the GTS 450 and upgrade later if I needed more speed.

The decision is ambiguous because the GTX 200 series is now discounted to the point that it has the cheapest memory bandwidth in the CUDA product line.

I have existing code that I would like to run faster. Additionally, I will make new code. However, for my purposes, the code is quite basic, so I’m not too sure if the Fermi features would be helpful at this point. Guess I’ll just have to read up more on the Fermi features to see how necessary those are.

Thanks again.

Now that I read more, I realize that this GTX 260 is old and even listed by NVIDIA as “previous generation”. On the flipside, in its “heyday”, it was indeed in the GTX line and a nice GPU. Even where it is still sold, prices are more than $200. Apparently, the only reason TigerDirect has it for 99 bucks is because BFG went bankrupt. Point: it seems like $99 is a great value for a card that is sold for $200+ just about everywhere else.

On the flipside, the GTS 450 is quite new, albeit without perhaps the high bandwidth and other features. It is not in the higher GTX line. And even as a new item, the cost is only $130 or so.

Again, makes me feel the GTX 260 must be a great value. But it concerned me reading the words “previous generation.”

Mind you, I’m just thinking out loud (err, typing) at this point.

Could you (or someone else) elaborate on this? I haven’t coded with CUDA in a couple of years and am curious as to what features you may be referring to.

Thanks.

If I went the route of getting a GPU with Fermi architecture, would it be smarter to pay an extra 30 bucks for a GTX 460 over the GTS 450?

Here are two links:

GTX 460: http://www.newegg.com/Product/Product.aspx?Item=N82E16814130562

GTS 450: http://www.newegg.com/Product/Product.aspx?Item=N82E16814130572

After doing a lot of reading about the GTS 450, the GTX 460, and now the new GTX 550 Ti (?), which just came out, it seems like the GTX 460 gets a LOT of love. At $140, would that be the better choice?

Anyone?

Today there seem to be better deals available on Newegg: go for the 1GB version of the GTX 460 from EVGA (even cheaper from Palit), which comes at the same price with even slightly cheaper shipping, but has more memory and higher memory bandwidth.
It’s what I would buy, as it has both the best performance and the best price/performance of the offers, although the difference from the GTS 450 is modest. The GTX 550 Ti is both inferior to the GTX 460 1GB and more expensive, so it is not an alternative.

Okay, it is clearly narrowed down to the GTX 460. One final question for someone who is familiar with this specific card, that is, the GTX 460.

I’m comparing two versions, both of which are the SAME price,

the 768 MB version: http://www.newegg.com/Product/Product.aspx?Item=N82E16814130562

the 1 GB version: http://www.newegg.com/Product/Product.aspx?Item=N82E16814130591

One would tend to think the 1GB would be more favorable, or at least popular, but when you read ratings around the internet, people seem to LOVE the 768 MB model. At Newegg, the 1GB version has 4 stars (which is still good), but the 768 MB version has a full 5 stars, with over 200 ratings.

Not being a gamer, the extra memory isn’t necessarily a big deal for me. Also, the 1GB model has the wider 256-bit interface. Perhaps people who program with CUDA want the extra CUDA cores on the 768 MB version (336 vs. 288)?

Just the more I read, it seems as though people felt the GTX 460 768 MB card broke the mold and was perfect for what it was meant to do. I plan to get that card as of now, unless someone has something enlightening that I am not aware of.

Finally, I’m not an expert on hardware by any means and wanted to see if my motherboard would be a bottleneck in any way. I built my machine two years ago with a Gigabyte board.

Here’s the link on Newegg:
http://www.newegg.com/Product/Product.aspx?Item=N82E16813128359

Is there anything in the specs that would show that the GTX 460 would be limited by my board?

Thanks.

I would definitely go with the GTX 460 1 GB version. The 768 MB version doesn’t have extra cores. They both have 336 CUDA cores, and the 1GB model has more memory bandwidth in addition to more capacity. (If you see someone quote 224 cores for a GTX 460, it is because they are using an out-of-date program to query the card parameters. For some reason, the CUDA function to get card properties only returns the number of multiprocessors, but not the number of CUDA cores. You then have to multiply that by 8 for compute capability 1.x, 32 for compute capability 2.0, and 48 for compute capability 2.1. Older query programs assumed that all capability 2.x cards had 32 cores per multiprocessor and thus report the wrong value.)
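To make the multiplication concrete, here is a minimal sketch of the lookup a query program needs. In a real CUDA program the multiprocessor count and the compute capability would come from `cudaGetDeviceProperties` (the `multiProcessorCount`, `major`, and `minor` fields); here they are just passed in by hand, and the function names are made up for illustration. The count of 7 multiprocessors for the GTX 460 is inferred from the figures above (336 / 48), and the table only covers the capabilities mentioned in this thread.

```python
def cores_per_sm(major, minor):
    """CUDA cores per multiprocessor for the compute capabilities discussed here."""
    if major == 1:
        return 8           # all compute capability 1.x parts
    if (major, minor) == (2, 0):
        return 32          # first Fermi parts
    if (major, minor) == (2, 1):
        return 48          # Fermi parts with the extra group of 16 cores per SM
    raise ValueError(f"compute capability {major}.{minor} is not in this table")


def cuda_cores(sm_count, major, minor):
    """Total CUDA cores = multiprocessor count x cores per multiprocessor."""
    return sm_count * cores_per_sm(major, minor)


# GTX 460: 7 multiprocessors (inferred), compute capability 2.1.
print("correct count:", cuda_cores(7, 2, 1))
# What an older tool reports if it assumes every 2.x part has 32 cores per SM.
print("stale 2.x assumption:", 7 * 32)
```

The first line gives 336 and the second gives the misleading 224. The same formula with 4 multiprocessors at capability 1.x gives the 32 cores of the 8600 GT.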

As for the motherboard, that looks fine. You might not get quite as much host<->device bandwidth as a newer socket 1366 motherboard, but the difference is not huge.

Actually, you are correct in that the 768 MB and the 1 GB both have 336 processors. However, the 1 GB link I gave was for the GTX 460 SE, which has 1 GB, but it does have fewer cores. There are two 1 GB models, one being about 40 bucks more than the SE version.

Oops, you’re right. I had never seen the SE version before, and incorrectly assumed it related to one of the minor clock variations that the third-party card manufacturers are so fond of. (Sometimes I think the people at NVIDIA who make up product names are trying to confuse us!)

That said, I’d probably still go for more memory bandwidth over more CUDA cores. Many more kernels are memory bandwidth limited than you would initially expect. The differences we’re talking about are minor. Either card looks like a good deal, and will be so much better than your 8600 GT. :)
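A rough way to see why bandwidth often matters more than core count is a roofline-style estimate: a kernel can run no faster than the smaller of the compute peak and (memory bandwidth x flops performed per byte moved). The sketch below applies that to a single-precision SAXPY (y = a*x + y), which does 2 flops for every 12 bytes of traffic. The peak GFLOPS and bandwidth figures are reference-card values recalled from spec sheets, so they are assumptions, and the function name is made up for illustration.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style upper bound: limited by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# float32 SAXPY: 1 multiply + 1 add per element; read x, read y, write y.
SAXPY_FLOPS_PER_BYTE = 2 / 12

# Assumed reference specs: (peak single-precision GFLOPS, bandwidth in GB/s).
cards = {
    "GTX 460 768 MB": (907.2, 86.4),   # more cores, narrower memory bus
    "GTX 460 SE 1 GB": (748.8, 108.8), # fewer cores, wider memory bus
}

for name, (peak, bandwidth) in cards.items():
    bound = attainable_gflops(peak, bandwidth, SAXPY_FLOPS_PER_BYTE)
    limiter = "memory" if bound < peak else "compute"
    print(f"{name}: at most {bound:.1f} GFLOPS ({limiter}-bound)")
```

Under these assumptions both cards are memory-bound by a wide margin on a kernel like this, and the SE comes out ahead despite having fewer cores. Kernels that do a lot of arithmetic per byte loaded would tilt the other way.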

Oh sorry, I hadn’t meant to point you to the SE. I originally wanted to link this GTX 460 1GB for $146, and then apparently fell into the SE trap myself.

That looks like a good card, but it doesn’t have a lot of ratings and averages 4 stars. In those ratings, most people refer to the MSI Cyclone as the one to get, which has upwards of 230 ratings at 5 stars. Great card. But ultimately I wanted a bit better long-term security and customer service, so I went with the EVGA 1GB version: 01G-P3-1371-AR.

Lifetime warranty. Found it for 165 bucks, free shipping. Hard to beat.

Thanks for all the help.

…now it’s time to make a new thread on the best way to set up a system…but first, I’ll use the rarely noticed search function.