I’m trying to create a dynamically sized array (of dim3) inside a kernel, in registers or shared memory. cudaMalloc is a host function, so it is unavailable there. I cannot use a static array, because the size of my array depends on a parameter passed to the kernel. A workaround would be to create an array for each thread in global memory, but that would reduce the speed of the program. Is there any way to create a new array inside a kernel? Like a T _array; or something?
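To make the global-memory workaround concrete, here is a minimal sketch of what I mean. The names (useScratch, runScratchDemo, scratch, n) are made up for illustration, and the kernel body is a placeholder, not my real code. The host allocates one buffer with room for n elements per thread, and each thread indexes its own slice:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread treats a slice of n dim3 elements
// in one global-memory buffer as its private, runtime-sized array.
__global__ void useScratch(dim3 *scratch, int n, unsigned int *out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    dim3 *mine = scratch + (size_t)tid * n;   // this thread's slice

    for (int i = 0; i < n; ++i)
        mine[i] = dim3(tid, i, 1);

    // Placeholder work: sum the y components, i.e. 0 + 1 + ... + (n - 1).
    unsigned int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += mine[i].y;
    out[tid] = sum;
}

// Host helper (also hypothetical): allocates the buffers, launches the
// kernel, and checks every thread's result. Returns true on success.
bool runScratchDemo(int n, int blocks, int threads)
{
    int total = blocks * threads;
    dim3 *scratch = nullptr;
    unsigned int *dOut = nullptr;

    if (cudaMalloc(&scratch, (size_t)total * n * sizeof(dim3)) != cudaSuccess)
        return false;
    if (cudaMalloc(&dOut, total * sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(scratch);
        return false;
    }

    useScratch<<<blocks, threads>>>(scratch, n, dOut);

    std::vector<unsigned int> hOut(total);
    cudaError_t err = cudaMemcpy(hOut.data(), dOut,
                                 total * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(scratch);
    cudaFree(dOut);
    if (err != cudaSuccess)
        return false;

    unsigned int expected = (unsigned int)n * (n - 1) / 2;
    for (unsigned int v : hOut)
        if (v != expected)
            return false;
    return true;
}

int main()
{
    // n is only known at run time, which is why a static array does not work.
    int n = 4;
    std::printf("%s\n", runScratchDemo(n, 2, 32) ? "ok" : "failed");
    return 0;
}
```

This works, but every access to the per-thread array goes through global memory, which is the slowdown I would like to avoid.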
Thanks in advance.