Using a component in shared memory twice

In the following much-simplified kernel example,

    __global__ void Kernel2(float *P, float *P_d, float *cost_d)
    {
        __shared__ float Ps;

        Ps[6] = P[ggIndex];               /* global to shared memory */

        cost_d[ggIndex] = Ps[6]*Ps[6];    /* shared to global */
        P_d[ggIndex] = Ps[6]*Ps[6];       /* shared to global */
    }

it seems like the P_d array has a problem receiving the value from shared memory. Why is that?
Any comments?

You have to do one of two things. Either declare the shared memory array as extern and launch the kernel with the amount of shared memory you want (in bytes) as the third launch parameter, after the grid and block sizes, or specify the size of the array in its declaration.
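The two options from the reply can be sketched side by side. This is a minimal illustration, not the poster's kernel: the kernel names squareStatic and squareDynamic, the helper countMismatches, and the sizes (256 threads, 4 blocks) are all assumptions made up for the example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Option A: the array size is part of the declaration (fixed at compile time).
__global__ void squareStatic(const float *in, float *out)
{
    __shared__ float buf[256];                      // must be >= threads per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    buf[threadIdx.x] = in[i];                       // global to shared
    out[i] = buf[threadIdx.x] * buf[threadIdx.x];   // shared to global
}

// Option B: extern array; its size in bytes comes from the kernel launch.
__global__ void squareDynamic(const float *in, float *out)
{
    extern __shared__ float buf[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    buf[threadIdx.x] = in[i];
    out[i] = buf[threadIdx.x] * buf[threadIdx.x];
}

// Hypothetical host helper: runs one variant and counts wrong results.
int countMismatches(bool useDynamic)
{
    const int threads = 256, blocks = 4, n = threads * blocks;
    std::vector<float> h_in(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_in[i] = 0.5f * i;

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    if (useDynamic)   // third launch parameter = shared memory bytes per block
        squareDynamic<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_out);
    else
        squareStatic<<<blocks, threads>>>(d_in, d_out);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != h_in[i] * h_in[i]) ++bad;
    return bad;
}

int main()
{
    printf("static:  %d mismatches\n", countMismatches(false));
    printf("dynamic: %d mismatches\n", countMismatches(true));
    return 0;
}
```

In both kernels each thread touches only its own slot, buf[threadIdx.x], so no synchronization is needed here. The extern form is the one to pick when the block size is only known at run time.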


I made it too simple, to show only the necessary part.

    __global__ void Kernel2(float *P, float *P_d, float *cost_d)
    {
        __shared__ float Ps[3333];

        tx = threadIdx.x; ty = threadIdx.y; tz = threadIdx.z;
        bx = blockIdx.x; by = blockIdx.y;
        dx = blockDim.x; dy = blockDim.y; dz = blockDim.z;

        smIndex = tx + ty*dx + tz*dx*dy;
        lmIndex = bx + by*gridDim.x;
        ggIndex = smIndex + lmIndex*dx*dy*dz;

        Ps[6] = P[ggIndex];               /* global P to shared memory Ps */

        cost_d[ggIndex] = Ps[6]*Ps[6];    /* shared to global */

        P_d[ggIndex] = Ps[6]*Ps[6];       /* shared to global */
    }

Again, I am showing only the key part of it.

When I try to write Ps[6]*Ps[6] a second time, to P_d, it does not seem to work. The first write, to cost_d, is O.K.
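The thread does not settle on a cause. One thing worth checking, offered as a guess rather than a diagnosis: in the snippet every thread of a block writes to the same element, Ps[6], with no synchronization, so which thread's value a later read of Ps[6] sees is not defined. A minimal sketch of the same kernel with one shared slot per thread follows. The name Kernel2Sketch, the helper countMismatchesKernel2, and the 4x4x4 block and 2x2 grid sizes are assumptions for illustration, not the poster's actual code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void Kernel2Sketch(const float *P, float *P_d, float *cost_d)
{
    __shared__ float Ps[3333];            // size from the post; needs >= dx*dy*dz

    int tx = threadIdx.x, ty = threadIdx.y, tz = threadIdx.z;
    int bx = blockIdx.x,  by = blockIdx.y;
    int dx = blockDim.x,  dy = blockDim.y, dz = blockDim.z;

    int smIndex = tx + ty*dx + tz*dx*dy;          // thread index within the block
    int lmIndex = bx + by*gridDim.x;              // block index within the grid
    int ggIndex = smIndex + lmIndex*dx*dy*dz;     // global element index

    Ps[smIndex] = P[ggIndex];             // each thread writes its own slot
    __syncthreads();                      // required once threads read each other's slots

    float v = Ps[smIndex];                // read once, reuse for both outputs
    cost_d[ggIndex] = v * v;
    P_d[ggIndex]    = v * v;
}

// Hypothetical host helper: counts elements where either output is wrong.
int countMismatchesKernel2()
{
    dim3 block(4, 4, 4), grid(2, 2);
    const int n = 4 * 4 * 4 * 2 * 2;      // 256 elements, one per thread
    std::vector<float> h_P(n), h_Pd(n, 0.0f), h_cost(n, 0.0f);
    for (int i = 0; i < n; ++i) h_P[i] = 0.25f * i;

    float *d_P = nullptr, *d_Pd = nullptr, *d_cost = nullptr;
    cudaMalloc(&d_P, n * sizeof(float));
    cudaMalloc(&d_Pd, n * sizeof(float));
    cudaMalloc(&d_cost, n * sizeof(float));
    cudaMemcpy(d_P, h_P.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    Kernel2Sketch<<<grid, block>>>(d_P, d_Pd, d_cost);

    cudaMemcpy(h_Pd.data(), d_Pd, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_cost.data(), d_cost, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_P);
    cudaFree(d_Pd);
    cudaFree(d_cost);

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        float want = h_P[i] * h_P[i];
        if (h_Pd[i] != want || h_cost[i] != want) ++bad;
    }
    return bad;
}

int main()
{
    printf("%d mismatches\n", countMismatchesKernel2());
    return 0;
}
```

If the real kernel does need all threads to read one shared element, the usual pattern is to have a single thread write it and then call __syncthreads() before anyone reads it.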
