Dear All,
I am trying to dynamically allocate multiple arrays that reside in shared memory. Here is a code snippet:
__global__ void kernel_function(char *Oarray1, float *Oarray3, short *Oarray2, int a) {
extern __shared__ char array[];
char *array1 = (char*) array;
short *array2 = (short*) ALIGN((char*) (&array1[a]), sizeof(short));
float *array3 = (float*) ALIGN((char*) (&array2[a]), sizeof(float));
Here, ALIGN is a device function that takes care of the alignment. I launch the kernel using:
kernel_function<<<dimGrid, dimBlock, (size_t) shared>>>(ddata1, ddata3, ddata2, a);
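For context, the byte count passed as the third launch parameter has to cover all three arrays plus any padding the alignment adds. The following is only a sketch of how that size might be computed on the host: round_up and shared_bytes are hypothetical names that do not appear in the original code, and it assumes each array holds a elements, that the alignments are powers of two, and that the start of the shared buffer is itself at least 4-byte aligned.

```cpp
#include <cstddef>

// Round an offset up to the next multiple of alignment.
// Assumption: alignment is a power of two.
inline std::size_t round_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: bytes of dynamic shared memory needed for
// a chars, then a shorts, then a floats, in the order the kernel carves them.
inline std::size_t shared_bytes(int a) {
    std::size_t n = static_cast<std::size_t>(a);
    std::size_t off = n * sizeof(char);                      // array1
    off = round_up(off, sizeof(short)) + n * sizeof(short);  // array2
    off = round_up(off, sizeof(float)) + n * sizeof(float);  // array3
    return off;
}
```

With something like this, the launch would pass shared_bytes(a) in place of (size_t) shared.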
I get the following error:
ptxas /tmp/tmpxft_00006b0d_00000000-2_dynamic-shared.ptx, line 0; fatal : (C6017) Unaligned access for SMEM (unknown symbol) in entry _Z15kernel_functionPcPfPsi; the offset should be 4-byte aligned.
But if I replace a with a constant, the program works fine.
Any ideas or a code example that shows how I can perform these steps when the size of the shared memory array is not known until runtime?
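To make the question concrete, here is a minimal sketch of the kind of helper that ALIGN stands for in the snippet. It is an illustration only: align_up and the HD macro are hypothetical names, the actual ALIGN may differ, and the sketch assumes the alignment is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical stand-in for ALIGN: rounds p up to the next address that is
// a multiple of alignment. Assumption: alignment is a power of two.
HD inline char *align_up(char *p, std::size_t alignment) {
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return reinterpret_cast<char *>((addr + mask) & ~mask);
}
```

Because the rounding depends on a, the resulting offset is only known at runtime, which is the case where the ptxas error appears.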
Thanks.