How do I decide the dynamic shared memory size?

Hi! I want to store 128*32 = 4096 float values in dynamic shared memory for smem_b, but I found that I have to use an offset of 8192 rather than 4096; otherwise I get incorrect values!

extern __shared__ __align__(16 * 1024) float smem[];
float* smem_b = smem;
float* smem_a = (float*)&smem_b[4096]; // Here! I must use 8192, otherwise the results are incorrect!

Also, does smem_b here still take up only 128*32*4/1024 = 16 KB, or does using 8192 make it take up 32 KB? Thank you!

How many elements should be stored in smem_a and smem_b, respectively?
What is the shared memory size parameter that you pass to the kernel launch?
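
To make those two questions concrete, here is a minimal sketch of how the offset and the launch parameter relate. It assumes both tiles hold 128*32 floats; the thread does not state the size of smem_a, so A_ELEMS, tile_kernel, and smem_bytes are hypothetical names for illustration only. The offset into smem is counted in float elements, while the third launch parameter is counted in bytes. With an offset of 4096, smem_b spans 4096 * 4 bytes = 16 KB; with an offset of 8192, the region reserved in front of smem_a is 32 KB. If an offset of 4096 gives wrong results, one possible cause is that the kernel writes more than 4096 floats through smem_b, or that the byte count passed at launch is too small.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed tile sizes. B_ELEMS matches the question; A_ELEMS is a guess,
// since the thread never says how many floats smem_a holds.
constexpr int B_ELEMS = 128 * 32;  // 4096 floats = 16 KB
constexpr int A_ELEMS = 128 * 32;  // hypothetical

// Total dynamic shared memory in BYTES for both tiles.
constexpr size_t smem_bytes(int b_elems, int a_elems) {
    return (static_cast<size_t>(b_elems) + a_elems) * sizeof(float);
}

__global__ void tile_kernel(float* out) {
    extern __shared__ __align__(16) float smem[];
    float* smem_b = smem;            // elements [0, B_ELEMS)
    float* smem_a = smem + B_ELEMS;  // elements [B_ELEMS, B_ELEMS + A_ELEMS)

    // Fill each tile within its own bounds only.
    for (int i = threadIdx.x; i < B_ELEMS; i += blockDim.x) smem_b[i] = 1.0f;
    for (int i = threadIdx.x; i < A_ELEMS; i += blockDim.x) smem_a[i] = 2.0f;
    __syncthreads();

    // If the tiles overlapped, the last element of smem_b would be clobbered.
    if (threadIdx.x == 0) out[0] = smem_b[B_ELEMS - 1] + smem_a[0];
}

int main() {
    float* d_out = nullptr;
    cudaMalloc(&d_out, sizeof(float));

    // Third launch parameter: bytes, not elements (here 32768 bytes = 32 KB).
    size_t bytes = smem_bytes(B_ELEMS, A_ELEMS);
    tile_kernel<<<1, 256, bytes>>>(d_out);

    float h_out = 0.0f;
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("dynamic smem bytes = %zu, result = %f\n", bytes, h_out);

    cudaFree(d_out);
    return 0;
}
```

This request stays under the 48 KB per-block default for dynamic shared memory; larger requests need an opt-in through cudaFuncSetAttribute with cudaFuncAttributeMaxDynamicSharedMemorySize on devices that support it.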
