Changes between CUDA 4.0 RC2 and RC1?

Hi tmurray – any significant changes between RC1 and RC2 worth mentioning? I didn’t see anything in the release notes.



Uh, if you were on TCC or the original RC1 driver in Linux, app startup times and memory consumption per process should be significantly reduced. Other than that, lots of little bug fixes.

Is GPU Direct 2.0 (direct peer memory access) supported on all Fermi GPUs now, or is it still restricted to Tesla, as it was in RC1?

Oh yeah, the peer-to-peer functions work on all Fermis now (so long as you support UVA), not just Teslas. Forgot about that one!
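If you want to see what your own box reports, here is a minimal sketch that lists each device, whether it supports UVA, and which device pairs can reach each other directly. It only uses runtime calls that exist in CUDA 4.0 (`cudaDeviceCanAccessPeer`, `cudaDeviceEnablePeerAccess`); the `CHECK` macro is just a helper made up for this example. I am not assuming any particular result for mixed GF100/GF104/GF110 setups, the program simply prints what the driver says.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    int n = 0;
    CHECK(cudaGetDeviceCount(&n));

    // Report each device and whether it participates in unified virtual addressing.
    for (int dev = 0; dev < n; ++dev) {
        cudaDeviceProp prop;
        CHECK(cudaGetDeviceProperties(&prop, dev));
        std::printf("device %d: %s (sm_%d%d), UVA: %s\n", dev, prop.name,
                    prop.major, prop.minor,
                    prop.unifiedAddressing ? "yes" : "no");
    }

    // Query every ordered pair; enable peer access where the driver allows it.
    for (int a = 0; a < n; ++a) {
        for (int b = 0; b < n; ++b) {
            if (a == b) continue;
            int can = 0;
            CHECK(cudaDeviceCanAccessPeer(&can, a, b));
            std::printf("device %d -> device %d: peer access %s\n", a, b,
                        can ? "possible" : "not possible");
            if (can) {
                CHECK(cudaSetDevice(a));                  // access is enabled from the current device
                CHECK(cudaDeviceEnablePeerAccess(b, 0));  // flags must be 0
            }
        }
    }
    return 0;
}
```

Once access is enabled in a given direction, a kernel running on device `a` can dereference pointers allocated on device `b`, and `cudaMemcpyPeer` can copy between the two without staging through the host.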


Are there any issues with heterogeneous mixes? GF100 with GF104 with GF110?

And, while we're talking about peer-to-peer memory access, is it even possible in theory that CPU memory indirection support could be added? I.e., the host CPU dereferences a 64-bit pointer which is transparently mapped to a GPU’s memory. No function calls, no copies, just a read or write to a pointer.

GPUs have such access to host memory already (zero-copy, introduced in CUDA 2.2), so if the host could access the GPU memory… that’d be an elegant completion! It would simplify a lot of setup coding.


Not really.
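For anyone who has not used the existing direction mentioned above (the GPU reading and writing host memory, i.e. zero-copy), here is a rough sketch of how it is usually set up. The kernel name `add_one`, the buffer size, and the `CHECK` macro are made up for illustration; the runtime calls themselves (`cudaSetDeviceFlags`, `cudaHostAlloc`, `cudaHostGetDevicePointer`) are the standard ones. It assumes device 0 and a device that reports `canMapHostMemory`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Each thread increments one element that physically lives in host memory.
__global__ void add_one(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 256;

    // Mapping must be requested before the device context is created.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));

    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.canMapHostMemory) {
        std::printf("device 0 cannot map host memory\n");
        return 0;
    }

    // Pinned host allocation that is also mapped into the GPU's address space.
    int *h = NULL;
    CHECK(cudaHostAlloc((void **)&h, n * sizeof(int), cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) h[i] = i;

    // Device-side alias of the same memory; no cudaMemcpy is involved.
    int *d = NULL;
    CHECK(cudaHostGetDevicePointer((void **)&d, h, 0));

    add_one<<<(n + 127) / 128, 128>>>(d, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // make the GPU's writes visible to the host

    // The host reads the results directly through its own pointer.
    std::printf("h[0] = %d, h[%d] = %d\n", h[0], n - 1, h[n - 1]);

    CHECK(cudaFreeHost(h));
    return 0;
}
```

Note that the mapping only runs one way here: the GPU gets a pointer into host memory, while the host still has no pointer it can dereference into GPU memory, which is the missing half being asked about.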