Dynamic Shared Memory Allocation

Dear All,

I am trying to dynamically allocate multiple arrays that will reside in shared memory. Here is a code snippet:

__global__ void kernel_function(char *Oarray1, float *Oarray3, short *Oarray2, int a) {

    // One dynamically sized shared buffer, carved into three arrays of a elements each
    extern __shared__ char array[];

    char  *array1 = (char*) array;
    short *array2 = (short*) ALIGN((char*) (&array1[a]), sizeof(short));
    float *array3 = (float*) ALIGN((char*) (&array2[a]), sizeof(float));

    // ... rest of the kernel
}

Where ALIGN is a device function that will take care of the alignment. I launch the kernel using:
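Since the definition of ALIGN is not shown, here is a minimal sketch of what such a helper might look like. The name align_up and the HOST_DEVICE macro are made up for illustration and are not the poster's code. It rounds a pointer up to the next multiple of a power-of-two alignment, and is written so that it compiles both under nvcc and under a plain C++ compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expands to the CUDA qualifiers under nvcc and to nothing elsewhere,
// so the helper can be used in kernels and checked on the host.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical stand-in for ALIGN: returns the first address >= ptr that
// is a multiple of alignment. alignment must be a power of two.
HOST_DEVICE inline char *align_up(char *ptr, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return reinterpret_cast<char *>((addr + mask) & ~mask);
}
```

With a helper like this, the second pointer in the snippet would be computed as (short*) align_up(&array1[a], sizeof(short)). Whether this avoids the ptxas error on a given toolkit version is not something the sketch can guarantee.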

kernel_function<<<dimGrid, dimBlock, (size_t) shared>>>(ddata1, ddata3, ddata2, a);

I get the following error:

ptxas /tmp/tmpxft_00006b0d_00000000-2_dynamic-shared.ptx, line 0; fatal : (C6017) Unaligned access for SMEM (unknown symbol) in entry _Z15kernel_functionPcPfPsi; the offset should be 4-byte aligned.

But if I replace a with a constant, the program works fine.

Any ideas, or a code example that shows how I can perform these steps when the size of the shared memory array is not known until runtime?


I have a feeling that the problem is related to the way ALIGN takes care of the alignment. Can you post its definition?

Also, does the calculation of shared take alignment into account?
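To make the question concrete: if the arrays are laid out as char, short, float with padding in between, the byte count passed at launch has to include that padding. Here is a rough host-side sketch, where round_up and shared_bytes are hypothetical names and the base of the shared buffer is assumed to be at least 4-byte aligned:

```cpp
#include <cstddef>

// Rounds offset up to the next multiple of alignment (a power of two).
inline std::size_t round_up(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Bytes needed for a chars, then a shorts, then a floats, with padding
// inserted so that each array starts on its natural alignment.
inline std::size_t shared_bytes(std::size_t a)
{
    std::size_t offset = a * sizeof(char);        // array1
    offset = round_up(offset, sizeof(short));     // pad before array2
    offset += a * sizeof(short);                  // array2
    offset = round_up(offset, sizeof(float));     // pad before array3
    offset += a * sizeof(float);                  // array3
    return offset;
}
```

For a = 5 this gives 36 bytes, one more than the unpadded sum of 35, so a launch that passes only the plain sum could run past the end of the buffer.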

By the way, you can get rid of ALIGN entirely if you order the arrays by decreasing natural alignment (float, then short, then char). Each array then ends on a boundary that already suits the next one:

extern __shared__ char array[];

float *array3 = (float*) array;
short *array2 = (short*) &array3[a];
char  *array1 = (char*) &array2[a];

Ordering the arrays that way works.
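For anyone finding this thread later, here is a small self-contained sketch of the reordered layout with a runtime size. The fill values, the run_demo helper, the launch configuration (1 block of 64 threads) and the __align__ qualifier on the buffer are my own assumptions for illustration, not part of the original code.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

__global__ void kernel_function(char *Oarray1, float *Oarray3, short *Oarray2, int a)
{
    // Single dynamic shared buffer; __align__ requests a float-aligned base.
    extern __shared__ __align__(sizeof(float)) char array[];

    // Widest type first, so no padding is needed between the arrays.
    float *array3 = (float*) array;
    short *array2 = (short*) &array3[a];
    char  *array1 = (char*)  &array2[a];

    for (int i = threadIdx.x; i < a; i += blockDim.x) {
        array3[i] = 0.5f * i;
        array2[i] = (short) (2 * i);
        array1[i] = (char) (i % 100);
    }
    __syncthreads();

    // Copy the shared arrays out so the host can inspect them.
    for (int i = threadIdx.x; i < a; i += blockDim.x) {
        Oarray3[i] = array3[i];
        Oarray2[i] = array2[i];
        Oarray1[i] = array1[i];
    }
}

// Hypothetical host-side check: returns true if the kernel ran and the
// three output arrays hold the expected values.
bool run_demo(int a)
{
    assert(a > 0);
    char *d1 = nullptr; short *d2 = nullptr; float *d3 = nullptr;
    std::vector<char> h1(a); std::vector<short> h2(a); std::vector<float> h3(a);

    cudaError_t err = cudaMalloc(&d1, a * sizeof(char));
    if (err == cudaSuccess) err = cudaMalloc(&d2, a * sizeof(short));
    if (err == cudaSuccess) err = cudaMalloc(&d3, a * sizeof(float));

    // Size is only known at runtime; no padding term with this ordering.
    size_t shared = a * (sizeof(float) + sizeof(short) + sizeof(char));
    if (err == cudaSuccess) {
        kernel_function<<<1, 64, shared>>>(d1, d3, d2, a);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(h1.data(), d1, a * sizeof(char), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess) err = cudaMemcpy(h2.data(), d2, a * sizeof(short), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess) err = cudaMemcpy(h3.data(), d3, a * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d1); cudaFree(d2); cudaFree(d3);   // freeing a null pointer is a no-op
    if (err != cudaSuccess) return false;

    for (int i = 0; i < a; ++i) {
        if (h3[i] != 0.5f * i || h2[i] != (short) (2 * i) || h1[i] != (char) (i % 100))
            return false;
    }
    return true;
}
```

Calling run_demo(5) from main exercises an odd element count, which is the case where the original char, short, float order needs padding.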