What I’ve noticed so far with the CUDA 2.0 beta
Compatibility with 1.1: HOOMD ( http://www.ameslab.gov/hoomd ), a high-performance GPU-accelerated molecular dynamics simulation program, compiled and ran flawlessly on CUDA 2.0 with no changes.
No performance delta: At least with the benchmarks in HOOMD, I see no appreciable performance delta from CUDA 1.1. Ok, so it isn’t faster, but at least it’s not slower :) Register usage might be +/-1 for some kernels, I didn’t check exhaustively.
Bug fixes in place: the cudaMemcpyToArray performance problem is fixed, and probably many more I haven’t noticed.
Additional features: There is the mentioned 3D texture support. I also noticed that events are improved somewhat and there are more device properties available from the query (like the number of multiprocessors). There is also a new warp size built-in variable that could be useful in some circumstances. There might be some other new features I’m missing here.
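For anyone who wants to poke at those last three items, here is a rough sketch that touches all of them: the device query (multiProcessorCount and warpSize fields), the built-in warpSize variable inside a kernel, and event timing around a launch. It assumes device 0 and is written against the runtime API as I know it; I have not checked it against the 2.0 beta specifically. The names lane_ids, expected_lane and run_demo are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread stores its lane index within its warp,
// using the built-in warpSize variable instead of a hard-coded 32.
__global__ void lane_ids(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = threadIdx.x % warpSize;
}

// Host-side reference for the lane index, used to check the kernel output.
int expected_lane(int thread_in_block, int warp_size)
{
    return thread_in_block % warp_size;
}

// Returns 0 on success, nonzero on a CUDA error or a mismatch.
// Assumption: device 0 is the one we want to query and run on.
int run_demo()
{
    // Device query: multiprocessor count and warp size.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess)
        return 1;
    printf("%s: %d multiprocessors, warp size %d\n",
           prop.name, prop.multiProcessorCount, prop.warpSize);

    const int n = 256;
    const int block = 64;
    int host[n];
    int *dev = NULL;
    if (cudaMalloc((void **)&dev, n * sizeof(int)) != cudaSuccess)
        return 1;

    // Events: record before and after the launch, then read the elapsed time.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    lane_ids<<<n / block, block>>>(dev, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %f ms\n", ms);

    // Copy back and compare against the host-side reference.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    int bad = (err != cudaSuccess);
    for (int i = 0; i < n && !bad; ++i)
        if (host[i] != expected_lane(i % block, prop.warpSize))
            bad = 1;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dev);
    return bad;
}
```

To try it, call run_demo() from a main() and compile the file with nvcc. The timing of a single tiny launch is mostly launch overhead, so treat the printed number as a smoke test of the event calls rather than a benchmark.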
Documentation improvements: There are man pages on Linux now: very handy for a quick lookup of a function. “man cudaMemcpyToArray” is much faster than “open pdf; wait; wait; wait; ctrl-F ‘cudaMemcpyToArray’; wait; wait; next; next; next; finally at the reference”. And the separation of programming/reference guides is good in theory, but… (see below)
- There still doesn’t seem to be any way to “cast” a pointer to be a global pointer and get rid of the “Advisory: can’t tell what pointer is pointing to” warnings. Did I miss it, or did this not make it into the 2.0 beta?
… haven’t found anything else bad yet
- While separating the programming and reference guides is good in theory, the implementation leaves something to be desired. The old appendix in the 1.1 guide nicely listed all functions by category. The new reference guide just gives a flat list, alphabetized I think.