CUDA 3.3 suggested improvements.

I hope it will have:

*3D grid support (right now the PTX files contain correct code, but the runtime fails)
*3D surface writes
*2D texture arrays and 2D surface arrays
*Some way of reading grid and block sizes from GPU memory at launch, similar to DispatchIndirect
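
Until 3D grids work at runtime, a common workaround is to fold the z dimension into y and launch a 2D grid, then recover the 3D block index inside the kernel. Below is a minimal host-side sketch of that index math in plain C++. `Dim3`, `foldGrid`, `unfoldBlockIdx` and `fitsIn2D` are hypothetical names made up for illustration, not part of the CUDA API, and the 65535 per-dimension grid limit is assumed for current hardware.

```cpp
// Hypothetical stand-in for CUDA's dim3 so the sketch compiles without the toolkit.
struct Dim3 { unsigned x, y, z; };

// Assumed per-dimension grid limit on current GPUs.
const unsigned kMaxGridDim = 65535u;

// Fold a logical 3D grid into a 2D launch grid by stacking z-slices along y.
inline Dim3 foldGrid(Dim3 logical) {
    return Dim3{logical.x, logical.y * logical.z, 1u};
}

// True if the folded grid still fits within the assumed 2D grid limits.
inline bool fitsIn2D(Dim3 logical) {
    return logical.x <= kMaxGridDim &&
           static_cast<unsigned long long>(logical.y) * logical.z <= kMaxGridDim;
}

// Recover the logical 3D block index from the launched 2D block index.
// In a kernel, `launched` would be blockIdx and `logical` a kernel argument.
inline Dim3 unfoldBlockIdx(Dim3 launched, Dim3 logical) {
    return Dim3{launched.x, launched.y % logical.y, launched.y / logical.y};
}
```

The cost is one integer division and one modulo per block, plus a tighter limit on y * z, which is why native 3D grids would still be welcome.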

Also for OpenCL:
*CL 1.1 support
*3d_image_write extension
*Some extension exposing 2D image arrays and a related 2d_image_array_write
*An extension similar to DispatchIndirect
*A pinned host memory equivalent for OpenCL
*clCreateEventFromGLsyncKHR for efficient OpenGL interop

On OpenGL:
*ARB_cl_event extension support


I would prefer that NVIDIA release a quick 3.3 that fixes all the bugs and performance issues in CUDA 3.1 and 3.2.

3.4 can then integrate all these new features, just like Intel’s Tick-Tock model or the Linux kernel versioning scheme.

It’s all a matter of nomenclature; you could also give the bugfix releases 3.2.x numbers, similar to the Qt numbering scheme…

Anyway, no problem if it is 3.3, but 3.4 would imply we are at least half a year away from seeing these features. That makes me realize that the NVIDIA DirectCompute driver has had these features since spring 2010, so it seems CUDA is not the state-of-the-art platform anymore (well, I’m exaggerating)…

Also, before finishing: it would be good to have some GPU callback mechanism (see the GTC poster) implemented, and to have more efficient CUDA/OpenGL interop exposed via events, similar to the OpenCL 1.1 enhancements mentioned above…

It seems confirmed that CUDA 3.3 will have cuda-gdb for Mac OS X; I just wanted to note that…

Mac OS cuda-gdb preview is now available @

I hope you can finally add VS2010 support too, now that the SP1 beta is out, and add Fermi support for cuda-gdb on Mac OS!

Share device pointers across host threads!