Create array in shared memory from kernel

I would like to create an array variable from the kernel to read global memory data into. My code is like this:

__device__
uint2* getList(uint index, int size, uint2* globalMemoryPtr)
{
    uint2 sharedMemPtr;

    //Copy from globalMemoryPtr to sharedMemPtr

    return sharedMemPtr;
}

__global__
void myKernel()
{
    uint2* sharedMemPtr = getList(threadIdx.x);
}

This code gives a warning about returning a local variable, and it also gives the error:

undefined reference to `__vla_alloc’ on the line where sharedMemPtr is created.

Against my better judgment, I also tried the cudaMalloc function, but this crashes everything. My understanding has always been that this function allocates global memory from the host.

Well, to my question. What would be a good way of reading the global memory array data into a shared memory array?

Any help/suggestions are appreciated.

No, you simply can’t do that.

As in C, local variables, including arrays, exist only until the end of their {} scope. That is, sharedMemPtr is released as soon as getList returns, so the pointer it returns is meaningless.

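You can’t dynamically allocate shared memory from inside a kernel. Its size has to be fixed either at compile time or at launch, through the third kernel launch parameter together with an extern __shared__ declaration. Plan your algorithm so that it knows beforehand how much it needs. The general CUDA kernel strategy is that each block should work with a small amount of its own “personal” shared memory.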
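To illustrate "knowing beforehand how much it needs", here is a minimal sketch where the host picks the shared memory size at launch. The kernel name reverseChunks, the helper runReverseChunks, and the sizes are all made up for the example; they are not from the code in the question. It also assumes the element count is an exact multiple of the block size, so no bounds check is needed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each block stages its chunk of `in` in shared memory,
// then writes the chunk back in reverse order so the shared copy is really used.
__global__ void reverseChunks(const uint2* in, uint2* out)
{
    // Unsized array: its byte size is the third launch parameter.
    extern __shared__ uint2 chunk[];

    unsigned int base = blockIdx.x * blockDim.x;
    chunk[threadIdx.x] = in[base + threadIdx.x];
    __syncthreads();  // wait until the whole chunk is in shared memory

    out[base + threadIdx.x] = chunk[blockDim.x - 1 - threadIdx.x];
}

// Hypothetical host helper: returns true if the device result matches the host check.
bool runReverseChunks()
{
    const int threads = 64, blocks = 4, n = threads * blocks;
    uint2 h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = make_uint2(i, 2 * i);

    uint2 *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc((void**)&d_in, n * sizeof(uint2)) != cudaSuccess) return false;
    if (cudaMalloc((void**)&d_out, n * sizeof(uint2)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(uint2), cudaMemcpyHostToDevice);

    // Shared memory size is decided here, on the host, before the kernel runs.
    reverseChunks<<<blocks, threads, threads * sizeof(uint2)>>>(d_in, d_out);

    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(uint2), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        int src = (i / threads) * threads + (threads - 1 - i % threads);
        if (h_out[i].x != h_in[src].x || h_out[i].y != h_in[src].y) return false;
    }
    return true;
}

int main()
{
    printf("%s\n", runReverseChunks() ? "match" : "mismatch or CUDA error");
    return 0;
}
```

The size is still fixed for the whole launch; device code never allocates anything, it only indexes into what the launch gave it.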

Thanks for the quick reply. I wanted to delegate all the responsibility for getting the data to this function, but I will have to do it the more standard way: allocate a buffer in the kernel and pass it to the functions that copy from global to shared memory.
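For anyone finding this later, a rough sketch of what I mean by that. The names loadList, LIST_SIZE, sums and runListDemo are placeholders I made up, and I am assuming one list per block (indexed by blockIdx.x rather than threadIdx.x, since shared memory is shared by the whole block) with a length known at compile time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define LIST_SIZE 32  // hypothetical list length, known at compile time

// Copies list number `index` from global memory into a buffer owned by the caller.
// Every thread of the block must call this, because it ends with __syncthreads().
__device__ void loadList(uint2* sharedBuf, const uint2* globalMemoryPtr,
                         unsigned int index, int size)
{
    for (int i = threadIdx.x; i < size; i += blockDim.x)
        sharedBuf[i] = globalMemoryPtr[index * size + i];
    __syncthreads();  // buffer is complete for all threads after this point
}

__global__ void myKernel(const uint2* globalMemoryPtr, unsigned int* sums)
{
    // The buffer lives in the kernel, so it stays valid after loadList returns.
    __shared__ uint2 list[LIST_SIZE];
    loadList(list, globalMemoryPtr, blockIdx.x, LIST_SIZE);

    // Toy use of the staged data: thread 0 sums the block's list.
    if (threadIdx.x == 0) {
        unsigned int s = 0;
        for (int i = 0; i < LIST_SIZE; ++i) s += list[i].x + list[i].y;
        sums[blockIdx.x] = s;
    }
}

// Hypothetical host helper: returns true if every block's sum matches the host sum.
bool runListDemo()
{
    const int blocks = 4, threads = 16, n = blocks * LIST_SIZE;
    uint2 h_in[n];
    unsigned int h_sums[blocks];
    for (int i = 0; i < n; ++i) h_in[i] = make_uint2(i, 1);

    uint2* d_in = nullptr;
    unsigned int* d_sums = nullptr;
    if (cudaMalloc((void**)&d_in, n * sizeof(uint2)) != cudaSuccess) return false;
    if (cudaMalloc((void**)&d_sums, blocks * sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(uint2), cudaMemcpyHostToDevice);

    myKernel<<<blocks, threads>>>(d_in, d_sums);

    cudaError_t err = cudaMemcpy(h_sums, d_sums, blocks * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_sums);
    if (err != cudaSuccess) return false;

    for (int b = 0; b < blocks; ++b) {
        unsigned int expected = 0;
        for (int i = 0; i < LIST_SIZE; ++i)
            expected += h_in[b * LIST_SIZE + i].x + h_in[b * LIST_SIZE + i].y;
        if (h_sums[b] != expected) return false;
    }
    return true;
}

int main()
{
    printf("%s\n", runListDemo() ? "match" : "mismatch or CUDA error");
    return 0;
}
```

The helper still hides the copy loop, which is most of what I wanted from getList; only the ownership of the buffer moved into the kernel.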