Can we dynamically allocate memory in kernel code now?

CUDA 3.0 with Fermi is supposed to support dynamic memory allocation in kernel code, but I cannot find any function for it in the reference manual. Does it really support dynamic allocation inside a kernel, and if so, where can I find the API?

The Fermi white paper says that the hardware architecture is designed to support dynamic memory allocation, but I don’t think that feature has been exposed in the CUDA toolkit yet: I don’t see any mention of it in the CUDA 3.1 programming guide. It looks like they are adding support for the advanced language features incrementally. The 3.1 beta adds function pointers, but no dynamic memory allocation yet.
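In the meantime, one common workaround is to allocate a single pool from the host with cudaMalloc and let each thread carve out its own piece with atomicAdd. The following is only a minimal sketch of that idea, not an official API: DevicePool, pool_alloc, fill, and run_pool_demo are hypothetical names made up for illustration, individual blocks can never be freed, and error handling is kept to the bare minimum.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical bump allocator over a pool preallocated on the host.
struct DevicePool {
    char*         base;      // start of the pool in global memory
    unsigned int  capacity;  // pool size in bytes
    unsigned int* offset;    // next free byte, advanced atomically
};

// Returns a pointer into the pool, or NULL when the pool is exhausted.
// Nothing is freed individually; the host resets the offset to reuse the pool.
__device__ void* pool_alloc(DevicePool pool, unsigned int bytes)
{
    bytes = (bytes + 7u) & ~7u;                          // keep 8-byte alignment
    unsigned int start = atomicAdd(pool.offset, bytes);  // reserve a range
    if (start > pool.capacity || bytes > pool.capacity - start) return NULL;
    return pool.base + start;
}

// Each thread requests a buffer whose size is only known at run time.
__global__ void fill(DevicePool pool, int** out, int nthreads)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= nthreads) return;
    int count = (tid % 4) + 1;
    int* p = (int*)pool_alloc(pool, count * sizeof(int));
    if (p) {
        for (int i = 0; i < count; ++i) p[i] = tid;
    }
    out[tid] = p;  // NULL marks a failed request
}

// Host side: returns how many threads got a buffer, or -1 on a CUDA error.
int run_pool_demo(int nthreads, unsigned int pool_bytes)
{
    DevicePool pool;
    int** d_out = NULL;
    pool.base = NULL;
    pool.offset = NULL;
    pool.capacity = pool_bytes;

    if (cudaMalloc((void**)&pool.base, pool_bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void**)&pool.offset, sizeof(unsigned int)) != cudaSuccess) return -1;
    if (cudaMalloc((void**)&d_out, nthreads * sizeof(int*)) != cudaSuccess) return -1;
    cudaMemset(pool.offset, 0, sizeof(unsigned int));

    fill<<<(nthreads + 127) / 128, 128>>>(pool, d_out, nthreads);

    // The blocking copy also waits for the kernel to finish.
    int** h_out = (int**)malloc(nthreads * sizeof(int*));
    cudaError_t err = cudaMemcpy(h_out, d_out, nthreads * sizeof(int*),
                                 cudaMemcpyDeviceToHost);
    int served = 0;
    for (int i = 0; i < nthreads; ++i) {
        if (h_out[i] != NULL) ++served;
    }

    free(h_out);
    cudaFree(d_out);
    cudaFree(pool.offset);
    cudaFree(pool.base);
    return (err == cudaSuccess) ? served : -1;
}
```

Calling run_pool_demo(256, 4096) from main would launch 256 threads whose rounded-up requests total 3072 bytes, so a 4096-byte pool should be able to serve all of them. The catch is that you have to size the pool up front and cannot return individual blocks, so this only suits kernels where an upper bound on the total is known in advance.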