Compute Capability 3.0 Documentation?

Has anyone seen any documentation from NVIDIA regarding Compute Capability 3.0? 2.1 was a fantastic upgrade over the 1.x devices, so I’d like to see what the new Kepler architecture brings to the table on the CC3.0 platform.

I haven’t been able to find anything – has anybody else?

The CUDA Programming Guide (Version 4.2), Appendix F.5, gives a description of compute capability 3.0. The major change is the much larger multiprocessor, which now includes 192 CUDA cores. Beyond that, few features have been added; the step is much smaller than the jump from compute capability 1.3 to 2.0.

The updated PTX manual with CUDA 4.2 also indicates a new “shfl” instruction that can move register contents between threads in a warp without using shared memory. It appears that major changes in functionality for Kepler will not be revealed until the more compute-focused GPUs are released later in the year.
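
For anyone curious what the shuffle looks like from CUDA C, here is a minimal sketch of a warp-level sum that passes partial results between registers instead of going through shared memory. It is only an illustration under some assumptions: the names warpReduceSum and sumOneWarp are made up for this example, it uses the `__shfl_down_sync` intrinsic from recent toolkits (I believe the CUDA 4.2-era spelling is `__shfl_down(v, offset)` without the mask argument, but check your toolkit's documentation), and it needs to be built for a compute capability 3.0 or newer target.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: sums the values held by the 32 threads of one warp
// using register-to-register shuffles. No shared memory is involved.
__device__ float warpReduceSum(float v)
{
    // Halve the distance on each step: 16, 8, 4, 2, 1.
    for (int offset = 16; offset > 0; offset >>= 1) {
        // Each lane adds the value held by the lane `offset` positions above it.
        // Assumed older spelling for CUDA 4.2-era toolkits: __shfl_down(v, offset)
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    return v;  // only lane 0 is guaranteed to hold the full sum
}

// Hypothetical kernel: launched with exactly one warp (32 threads).
__global__ void sumOneWarp(const float *in, float *out)
{
    float total = warpReduceSum(in[threadIdx.x]);
    if (threadIdx.x == 0) {
        *out = total;
    }
}

int main()
{
    const int kWarpSize = 32;
    float hostIn[kWarpSize];
    for (int i = 0; i < kWarpSize; ++i) {
        hostIn[i] = (float)i;  // values 0..31, which add up to 496
    }

    float *devIn = NULL;
    float *devOut = NULL;
    cudaMalloc(&devIn, sizeof(hostIn));
    cudaMalloc(&devOut, sizeof(float));
    cudaMemcpy(devIn, hostIn, sizeof(hostIn), cudaMemcpyHostToDevice);

    sumOneWarp<<<1, kWarpSize>>>(devIn, devOut);

    // The blocking copy also waits for the kernel to finish and surfaces
    // any error raised by the launch.
    float result = 0.0f;
    cudaError_t err = cudaMemcpy(&result, devOut, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("warp sum = %.1f\n", result);

    cudaFree(devIn);
    cudaFree(devOut);
    return 0;
}
```

Something like `nvcc -arch=sm_30 shfl_sum.cu` (file name is just an example) should build it on a toolkit that still supports that target; newer toolkits will need a newer `-arch` value. Since the exchange stays in registers, the reduction needs no shared memory allocation and no `__syncthreads()`.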

And where do you get version 4.2? The latest available at developer.nvidia.com (or even through the registered partner web site, partners.nvidia.com) is version 4.1, with documentation dated 11/18/2011.

Ah, an NVIDIA employee posted a direct link in the forum:

I don’t know why this isn’t being distributed via the registered developer page or directly on the NVIDIA website yet.

Cool, thank you. That topic should be sticky at least until it goes live on the web site.