The CUDA 2.2 beta is available to registered developers. If you want to become a registered developer, sign up here.
A brief overview of CUDA 2.2 beta features:
- Zero-copy support (see this thread for more information)
- Asynchronous memcpy on Vista/Server 2008
- Texturing from pitch-linear memory
- cuda-gdb for 64-bit Linux (it is pretty great)
- OpenGL interop performance improvements
- CUDA profiler supports a lot more counters on GT200. I think this includes memory bandwidth counters (counters for each transaction size) and instruction count. In other words, you can very easily determine if you’re bandwidth limited or compute limited, which makes it far more useful than it used to be.
- CUDA profiler works on Vista
- 4GB of pinned memory in a single allocation (except in Vista, where the limit is still 256MB per allocation, but I think this is going to be raised between now and the final release)
- Blocking sync for all platforms. I’m not entirely sure whether this made it into the headers for the beta; I’ve heard conflicting reports and need to check this afternoon. Basically, it’s a context creation flag: instead of spinlocking (or spinlocking and yielding) while a thread waits for the GPU, the thread sleeps and the driver wakes it up when the event has completed. It’s not the default mode because you’re at the mercy of the OS thread scheduler, which will sometimes increase latency, but if you want to minimize CPU utilization, it’s very nice.
- Officially supports Ubuntu 8.10, RHEL 5.3, Fedora 10
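To make the blocking sync trade-off from the list above a bit more concrete, here is a rough host-only C++ sketch of the two waiting strategies. It is an analogy, not CUDA driver code: `FakeEvent`, `wait_spin`, `wait_blocking` and `run_once` are hypothetical names made up for illustration, and an ordinary thread stands in for the GPU.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a GPU event; not part of any CUDA API.
struct FakeEvent {
    std::atomic<bool> done{false};
    std::mutex m;
    std::condition_variable cv;

    // Called by the simulated "driver" when the work has completed.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            done.store(true, std::memory_order_release);
        }
        cv.notify_all();
    }
};

// Spin (optionally yielding) until the event completes. Wake-up latency is
// low, but the waiting thread keeps a CPU core busy the whole time.
// Returns the number of polls that found the event still pending.
std::uint64_t wait_spin(FakeEvent& ev, bool yield) {
    std::uint64_t polls = 0;
    while (!ev.done.load(std::memory_order_acquire)) {
        ++polls;
        if (yield) std::this_thread::yield();
    }
    return polls;
}

// Blocking wait: the thread sleeps until it is signalled, so CPU use is
// minimal, but the wake-up time depends on the OS thread scheduler.
// Returns the number of times the thread was woken.
std::uint64_t wait_blocking(FakeEvent& ev) {
    std::uint64_t wakeups = 0;
    std::unique_lock<std::mutex> lock(ev.m);
    while (!ev.done.load(std::memory_order_acquire)) {
        ev.cv.wait(lock);
        ++wakeups;
    }
    return wakeups;
}

// Runs simulated "GPU work" of the given duration on another thread and
// waits for it with the chosen strategy. Returns true once it is complete.
bool run_once(bool blocking, std::chrono::milliseconds work) {
    FakeEvent ev;
    std::thread gpu([&ev, work] {
        std::this_thread::sleep_for(work);
        ev.signal();
    });
    if (blocking) {
        wait_blocking(ev);
    } else {
        wait_spin(ev, /*yield=*/true);
    }
    gpu.join();
    assert(ev.done.load());
    return ev.done.load();
}
```

Calling `run_once(true, std::chrono::milliseconds(20))` and `run_once(false, std::chrono::milliseconds(20))` while watching CPU usage should show the difference: the spinning variant burns a core for the duration of the wait, while the blocking variant stays idle until it is woken.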
There’s one last feature that didn’t make it in the beta that I think is the best feature in 2.2 (even compared to the dramatically improved profiler, zero-copy and the 64-bit debugger), but I don’t want to spoil it…
Edit: Here’s the 2.2 beta programming guide.
Edit 2: I am bad at not revealing surprises. There’s still a second surprise in the final release for Windows users, though.
Edit 3: Surprise 2: a test version of /MD CUDART. I revealed it because I want feedback on it and whether anyone has objections to moving everything over to /MD going forward.