CUDA Laptop: A discussion on Benefit-Cost Ratio.

How about here? (© Google Translate)

“CUDA-Enabled Products”

Yeah, I’m not aware of any AGP cards (though there is the regular PCI card mentioned).

Part of the reason for this is bus speed. CUDA depends a lot on how fast you can move data between the GPU and CPU. PCI on consumer motherboards tops out at 133 MB/sec nominal, while AGP in its final form hit just over 2 GB/sec nominal. PCI Express started at 4 GB/sec nominal (for an x16 slot that a graphics card goes into), which in practice reaches about 3.2 GB/sec in CUDA. Now it’s double that, and next year it will double again.
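To put those bus numbers in perspective, here is a rough back-of-the-envelope sketch of transfer times. The bandwidth figures are the nominal ones quoted above (plus the ~3.2 GB/sec CUDA-measured PCIe figure); real transfers depend on pinned memory, block size, and chipset, so treat these as illustrations only:

```python
# Rough host<->GPU transfer-time estimate for a 256 MB working set
# over the buses discussed above. Nominal peak figures; real CUDA
# transfers achieve noticeably less.
buses_mb_per_s = {
    "PCI (32-bit/33 MHz)": 133,
    "AGP 8x": 2133,
    "PCIe 1.x x16 (nominal)": 4000,
    "PCIe 1.x x16 (measured in CUDA)": 3200,
}

data_mb = 256
for bus, bw in buses_mb_per_s.items():
    print(f"{bus}: {data_mb / bw:.3f} s per {data_mb} MB transfer")
```

Plain PCI needs almost two seconds for a transfer that PCIe finishes in well under a tenth of a second, which is why nobody bothers putting serious CUDA parts on the old buses.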

The other aspect is marketing: the gamer audience that fuels much of the high-end card sales was lining up to buy these cards because of DirectX 10 support (DX10 support in NVIDIA cards arrived at the same time as CUDA). That demographic is more than willing to upgrade to a newer motherboard with a nice bus to support their new video card. This means there is little demand to build a new GPU chip with a native AGP interface, so building an AGP card would require some kind of translator bridge chip. That seems to be both hard and expensive.

hehe… the problem is that now I have to cross-reference this with all the laptops I’d be searching for pricing…

“sparkle 8500gt pci”

Thanks… also, it seems there are other PCI cards with CUDA; the 8600GT apparently has about 3x as many GFLOPS:



Found another one! here:

Seems reasonable… How many FLOPS do you guys think it reaches? =)
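A rough way to answer that: for these G8x-era parts the theoretical single-precision peak is usually quoted as shader cores × shader clock × 3 FLOPs per cycle (one MAD plus the co-issued MUL; count only 2 if you ignore the MUL). As a sketch, using the commonly published stock specs (16 SPs at 900 MHz for an 8500 GT, 32 SPs at 1190 MHz for an 8600 GT; worth double-checking against the exact card):

```python
# Theoretical single-precision peak: cores * shader_clock_GHz * flops_per_cycle.
# The core counts and clocks below are stock-card figures from memory --
# check your specific model, since vendors ship overclocked variants.
def peak_gflops(cores, shader_clock_ghz, flops_per_cycle=3):
    return cores * shader_clock_ghz * flops_per_cycle

print(peak_gflops(16, 0.900))  # 8500 GT -> ~43 GFLOPS
print(peak_gflops(32, 1.190))  # 8600 GT -> ~114 GFLOPS
```

That ratio (roughly 2.6x) lines up with the “about 3x” figure mentioned above. Real kernels reach only a fraction of the theoretical peak, of course.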



Good morning,

So, the best way to find the right graphics card with the right CUDA Compute Capability is to check the list of graphics cards sorted by category in

Then you take the list of CUDA Compute Capabilities for each card, which you will find in …Guide_2.2.1.pdf

(it’s exactly the same data table that jph4599 found in the 2.3 beta)

And finally, you check whether your graphics card exists in a laptop. I discovered that Alienware has an edge here, because they are already selling laptops with the GTX 260M, which I found nowhere else (I want the cheapest laptop with CCC 1.3 :rolleyes: ).

You are aware that the GTX260M isn’t a 1.3 capability part? It is a mobile derivative of the G92b (9800GT), and is only compute 1.1.

And does a laptop with CUDA Compute Capability 1.3 exist?

It seems that the best graphics card available in a laptop is the GTX 280M, with 14 multiprocessors and CCC 1.1, and it’s not really cheap :(

There are presently no compute 1.3 mobile parts that I am aware of. There should be some compute 1.2 parts coming later in the year, but if you want double precision, then the big PCI-e X16 desktop/server boards are the only way to get it for the moment.

Do you think that CUDA features and GPU size are linked?
What I mean is, do you think an embedded GPU could still support CUDA even if it has fewer MPs, less memory…

Sure. The midrange mobile parts are certainly perfectly functional and work well for development and for small and mid-sized problems. A 16 or 32 core 9xxx series part will probably comfortably outperform the host dual or quad core mobile CPU it is paired with. But there is no getting around the limitations of laptops for performance computing generally, and for CUDA it is no different.

Yes, of course you’re right! What I mean is: do you think functionality like atomic functions or double precision could be disabled because of GPU size?

Related questions: what area of the GPU manages these functionalities? Does it need a lot of die space? Do they add just some connections, or whole modules?

Double precision is certainly not present in mobile GPUs because of the chip area required for a double precision floating point unit in each multiprocessor. The compute capability 1.2 mobile GPUs coming later this year (which have everything the 1.3 GPUs have except DP) are possible because of NVIDIA transitioning to the smaller 40 nm process.

That settles it for me, thank you very much for your help, avidday and seibert!

I hope that NVIDIA will quickly improve its mobile GPUs for developers who want to use CUDA on a laptop without losing too many features.

The best strategy, I suppose, is to wait for a better laptop, because buying now is expensive, a bad investment, and consequently not very smart, right?

Umm, I don’t regret buying that laptop with a 9600M GT GPU. The laptop was quite inexpensive.

I was talking about a laptop for CUDA development…

I need to buy a laptop, and if I do it now I won’t have all the CUDA features (because there is only CCC 1.1 available and I need atomic functions). Moreover, in approximately two months a new range of laptops with CCC 1.2 or 1.3 will appear on the market. Buying two laptops in less than a quarter isn’t very smart, don’t you think?

?? Atomic operations on global memory are in compute 1.1. They are the main addition that compute capability 1.1 brings.

No arguments there.

I see it like this: the current generation hardware is quite cheap (and will get even cheaper) with the next generation on the horizon.

If shared memory atomics and other Compute 1.2 features are a need for your application, then wait of course.

In my case I’ve always been happy with Compute 1.1 and global atomics (allowing for inter-block synchronization primitives and such).


Exactly. It’s not really expensive, as long as CCC 1.1 is enough and the number of multiprocessors isn’t very important.

Another question: how are the number of multiprocessors and GPU computing time linked? Does it depend heavily on the algorithm, or not?

In fact, the number of MPs changes a lot from one graphics card to another (on laptops I see anywhere from 1 MP to 14 MPs), so how can you estimate your computation time on another graphics card with more or fewer MPs?

Computation speed scales with GPU (shader) clock and number of MPs.

Contributing factors for memory bandwidth:

- dedicated (++) or shared RAM (--)
- GDDR2 (-) or GDDR3 dedicated RAM (+)
- memory bus width: 64-bit (--), 128-bit, or 256-bit (++)
- better coalescing in compute model 1.2/1.3 (+) vs compute model 1.1 (-)
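Those factors can be folded into a crude cross-card estimator: scale a measured time by MP count × shader clock if the kernel is compute-bound, or by bus width × memory clock if it is bandwidth-bound. A sketch (the two card spec dictionaries below are hypothetical examples, not measurements of any real part):

```python
# Crude time-scaling estimate between two GPUs. Pick the ratio that
# matches the kernel: compute-bound -> MPs * shader clock,
# bandwidth-bound -> bus width * effective memory clock. Real kernels
# sit somewhere in between, so treat the result as a rough bound.
def estimate_time(measured_s, old, new, bound="compute"):
    if bound == "compute":
        ratio = (old["mps"] * old["shader_ghz"]) / (new["mps"] * new["shader_ghz"])
    else:  # "bandwidth"
        ratio = (old["bus_bits"] * old["mem_ghz"]) / (new["bus_bits"] * new["mem_ghz"])
    return measured_s * ratio

# Hypothetical laptop part vs hypothetical high-end mobile part:
card_a = {"mps": 4,  "shader_ghz": 1.25, "bus_bits": 128, "mem_ghz": 0.80}
card_b = {"mps": 14, "shader_ghz": 1.46, "bus_bits": 256, "mem_ghz": 0.95}

print(estimate_time(10.0, card_a, card_b, bound="compute"))    # ~2.4 s
print(estimate_time(10.0, card_a, card_b, bound="bandwidth"))  # ~4.2 s
```

Note how the same 10-second kernel gets two quite different estimates depending on which resource it is limited by, which is exactly why the answer "are they very influenced by the algorithm?" is yes.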