Entry-level GPU for CUDA learning

I am assembling a new PC and I’m planning on working with CUDA (I will take a course next semester involving CUDA programming).
Which GPU would you recommend? My budget is low, so I’m focusing on entry-level GPUs.
Some viable options so far would be

Quadro 600 http://www.nvidia.com/object/product-quadro-600-us.html
and either GTX 550Ti or the 560Ti http://www.geforce.com/Hardware/GPUs/geforce-gtx-550ti/specifications

So far I’m focusing on compute capability 2.1, number of cores, and, well, price.
Should I be looking for something else too?

Thanks in advance.

Don’t buy a Quadro card - they are meant for professional graphics and their price tag carries a heavy surcharge for software and service you are not interested in.

GTX 550Ti or 560Ti are good choices with a limited budget. Note that for CUDA work, compute capability 2.0 is actually preferable to compute capability 2.1 (it is easier to saturate all cores on 2.0 devices than on 2.1). So the “GTX 560Ti 448 Core Limited Edition” would also be a good choice if it is still within your price range.

If you plan to use it only for learning and gaming, I suggest buying the cheapest Fermi card that satisfies your needs. If you plan to do multi-GPU programming, then you should buy the two cards that you can afford. If you start doing high-performance computing, you will find that you can write a parallel program very quickly on your card at home, but then you will spend lots of time optimizing it on a cluster with Tesla cards.

I consider the GeForce GTX 560 Ti with 448 CUDA cores (see the screenshot below) the best solution for entry-level CUDA programming.

Let me ask you guys 2 questions:

  1. What’s the deal with 1536 CUDA cores on the forthcoming model GTX670?
  2. What’s the difference between compute capability 2.0 and 2.1? Which is better and why? Should I care about this version as an entry-level CUDA programmer?

Compute capability 2.1 devices have 48 cores per multiprocessor, while compute capability 2.0 devices have 32 cores per multiprocessor. Since a warp has only 32 threads, you need to do extra work when programming to be sure that you use all 48 cores on a multiprocessor (in practice, each thread needs independent instructions that the scheduler can issue together). The 6xx series is a new architecture; it seems to be considerably different from Fermi, and its transistors are smaller.
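
To make the core counts concrete, here is a minimal C++ sketch of that arithmetic (plain host code, no GPU needed). The function names are made up for illustration, and the multiprocessor counts in the checks are the figures I recall for these cards, so please verify them against the spec sheets.

```cpp
// Cores per multiprocessor (SM) for the two Fermi compute capabilities
// discussed in this thread. Returns 0 for anything else.
// (Hypothetical helper, not part of the CUDA toolkit.)
constexpr int coresPerSM(int major, int minor) {
    return (major == 2 && minor == 0) ? 32
         : (major == 2 && minor == 1) ? 48
         : 0;
}

// Total CUDA cores = number of multiprocessors * cores per multiprocessor.
constexpr int totalCores(int smCount, int major, int minor) {
    return smCount * coresPerSM(major, minor);
}

// SM counts below are assumptions taken from the cards' published specs.
static_assert(totalCores(4, 2, 1) == 192, "GTX 550 Ti: 4 SMs, CC 2.1");
static_assert(totalCores(8, 2, 1) == 384, "GTX 560 Ti: 8 SMs, CC 2.1");
static_assert(totalCores(14, 2, 0) == 448, "GTX 560 Ti 448 Core: 14 SMs, CC 2.0");
```

So the 448-core card gets its higher core count from having more multiprocessors, each with the "simpler" 32-core layout.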

Thanks @pasoleatis :)

Considering these additional 16 cores, do you think that compute capability 2.0 is a better choice for beginners?

[Sorry for the typos in my post above, just fixed them now.]

For beginners, to first order the difference between compute capability 2.0 and 2.1 is negligible - both are 100% compatible. Performance-wise, however, CC 2.0 is better than 2.1, both because CC 2.0 devices have higher absolute performance and because it is easier to get close to peak performance. So I’d recommend CC 2.0 unless you either want to set yourself an extra challenge or want to create code that performs well on a wide variety of cards (although the latter probably requires buying a range of cards with different compute capabilities anyway).
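
The "easier to get close to peak" point can be put into numbers with a deliberately simplified model. The assumptions here are mine: each Fermi multiprocessor has two warp schedulers, each issued warp instruction keeps one group of 16 cores busy, and on CC 2.1 the third group of 16 is only used when a scheduler can issue a second, independent instruction from the same warp. The names are hypothetical and real hardware has more subtleties than this.

```cpp
// Simplified model of core usage on one Fermi multiprocessor (SM).
// Assumption: 2 warp schedulers, each driving one group of 16 cores.
constexpr int kCoresPerGroup = 16;
constexpr int kWarpSchedulers = 2;
constexpr int kBaselineCores = kCoresPerGroup * kWarpSchedulers;  // 32

// Cores kept busy per SM. Without independent instructions inside each
// thread, only the baseline 32 cores can be fed; with them, all cores can.
constexpr int busyCoresPerSM(int coresPerSM, bool hasIndependentInstructions) {
    return hasIndependentInstructions ? coresPerSM
         : (coresPerSM < kBaselineCores ? coresPerSM : kBaselineCores);
}

static_assert(busyCoresPerSM(32, false) == 32, "CC 2.0: all cores busy");
static_assert(busyCoresPerSM(48, false) == 32, "CC 2.1: 16 cores idle");
static_assert(busyCoresPerSM(48, true) == 48, "CC 2.1 with independent instructions");
```

Under this model, straightforward code on a CC 2.1 card uses 32 of 48 cores per multiprocessor, about two thirds, while the same code on CC 2.0 can use all 32.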