My problem seems really simple, so I apologize in advance.
I have an array in global memory (g_array in the code below).
I want to load it into shared memory in tiles of size 16 x 16.
Is the following expression correct, and will I end up with 64 different blocks, each with my data loaded?
real, shared :: s_array(0:15,0:15)
s_array(tIdx,tIdy) = g_array(d_j,d_l)
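
To make the question concrete, here is a minimal sketch of the kind of kernel I have in mind. Everything beyond the two lines above is an assumption for illustration: the 128 x 128 global size (chosen so that an 8 x 8 grid gives 64 blocks), the names tile_load, load_tile, g_out, N and TILE, and the way d_j and d_l are computed from the block and thread indices. The write back to g_out is only there so the host can check what each block loaded.

```fortran
module tile_load
  use cudafor
  implicit none
  integer, parameter :: TILE = 16   ! tile edge, matches s_array(0:15,0:15)
  integer, parameter :: N = 128     ! assumed global edge: 8 x 8 = 64 tiles
contains
  attributes(global) subroutine load_tile(g_array, g_out)
    real, device :: g_array(0:N-1, 0:N-1)
    real, device :: g_out(0:N-1, 0:N-1)
    real, shared :: s_array(0:TILE-1, 0:TILE-1)
    integer :: tIdx, tIdy, d_j, d_l

    ! threadidx and blockidx are 1-based in CUDA Fortran, so shift to 0-based
    tIdx = threadidx%x - 1
    tIdy = threadidx%y - 1

    ! assumed mapping: each block handles one 16 x 16 tile of the global array
    d_j = (blockidx%x - 1) * blockdim%x + tIdx
    d_l = (blockidx%y - 1) * blockdim%y + tIdy

    ! each thread loads one element of its block's tile
    s_array(tIdx, tIdy) = g_array(d_j, d_l)

    ! wait until the whole tile is loaded before any thread reads from it
    call syncthreads()

    ! copy the tile back out so the host can verify the load
    g_out(d_j, d_l) = s_array(tIdx, tIdy)
  end subroutine load_tile
end module tile_load

program main
  use cudafor
  use tile_load
  implicit none
  real :: h_in(0:N-1, 0:N-1), h_out(0:N-1, 0:N-1)
  real, device :: d_in(0:N-1, 0:N-1), d_out(0:N-1, 0:N-1)
  type(dim3) :: grid, tblock

  call random_number(h_in)
  d_in = h_in                        ! host-to-device copy

  tblock = dim3(TILE, TILE, 1)       ! 16 x 16 threads per block
  grid   = dim3(N/TILE, N/TILE, 1)   ! 8 x 8 = 64 blocks

  call load_tile<<<grid, tblock>>>(d_in, d_out)
  h_out = d_out                      ! device-to-host copy

  print *, 'max abs difference: ', maxval(abs(h_out - h_in))
end program main
```

If the indexing is right, I would expect the printed difference to be zero, since every block just copies its own tile through shared memory. Note that s_array is private to each block, so with this launch there would be 64 separate copies of it, one per block.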