Dynamic memory allocation inside kernel

I know that calling malloc inside a kernel should be avoided wherever possible, but for testing purposes I’d like to write a simple piece of code in which one of my kernel functions allocates a small array. However, when I try to allocate memory inside a device function, I get an error about calling a host function. nvcc --version reports release 4.0, V0.2.1221, and as far as I know dynamic allocation on the GPU works with versions > 2.0. Do I need to compile with a special flag, or is there something else I need to do to get it working?

For reference, the only flag I’ve been using thus far is --machine 32.


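Dynamic memory allocation inside a kernel is only possible on devices of compute capability 2.x and higher.
With no -arch flag, nvcc in that release compiles for compute capability 1.0, where malloc is a host-only function, hence the error. So you have to add -arch=sm_20 (or higher) to your compiler options.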
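
For testing, something like the following minimal sketch should do. The kernel name fillAndSum, the file name test.cu, and the sizes are made up for illustration, and it assumes a card of compute capability 2.0 or higher.

```cuda
// Hypothetical test file, e.g. test.cu. Build with something like:
//   nvcc -arch=sm_20 --machine 32 test.cu -o test
// (newer toolkits no longer accept sm_20; use the value matching your card)
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread allocates its own small array from the device heap,
// fills it with 0..n-1, and writes the sum to out[threadIdx.x].
__global__ void fillAndSum(int n, int *out)
{
    int *scratch = (int *)malloc(n * sizeof(int));
    if (scratch == NULL) {          // device malloc returns NULL on failure
        out[threadIdx.x] = -1;
        return;
    }

    int sum = 0;
    for (int i = 0; i < n; ++i) {
        scratch[i] = i;
        sum += scratch[i];
    }
    out[threadIdx.x] = sum;

    free(scratch);                  // must also be freed in device code
}

int main()
{
    const int threads = 4;
    const int n = 8;
    int h_out[threads];
    int *d_out = NULL;

    cudaMalloc((void **)&d_out, threads * sizeof(int));
    fillAndSum<<<1, threads>>>(n, d_out);

    // cudaMemcpy waits for the kernel and also reports launch errors.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    for (int i = 0; i < threads; ++i)
        printf("thread %d: sum = %d\n", i, h_out[i]);

    cudaFree(d_out);
    return 0;
}
```

If the allocation succeeds, every thread should report a sum of 28 (0 + 1 + ... + 7); a value of -1 means the in-kernel malloc returned NULL. The device heap is limited in size, so larger allocations may need it raised with cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...) before the first kernel launch.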