Hello friends, I need some help understanding shared memory declarations.
Let's say I have a very simple kernel that copies a 4 by 4 matrix from device (global) memory to shared memory:
// matrix is a 4 by 4 matrix
__global__ void kernel(float *matrix)
{
    __shared__ float d_array[32];
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    d_array[row * 4 + col] = matrix[row * 4 + col];
}
- When I declare d_array as __shared__, how many copies of d_array are created in shared memory?
- In the __shared__ d_array declaration, is the "32" the size of my array?
- When this kernel has finished, is the matrix still stored in shared memory so that my next kernel can use it directly?
- Is this way of copying the values correct: d_array[row * 4 + col] = matrix[row * 4 + col]?
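To make the questions concrete, here is a minimal self-contained sketch of what I am trying to do. It assumes a single block of 4 x 4 threads. The names copyThroughShared, runCopy and the out buffer are my own, made up for illustration. As far as I understand the CUDA programming guide, a __shared__ array is allocated once per thread block and only lives as long as that block, which is why the sketch writes the values back to global memory before the kernel ends. Please correct me if that reading is wrong.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 4  // the matrix is N x N (4 by 4)

// Copies the matrix from global memory into shared memory, then back out
// to a second global buffer so the host can inspect the result.
__global__ void copyThroughShared(const float *matrix, float *out)
{
    // Assumption: one tile per thread block, sized to exactly 4 x 4 floats.
    __shared__ float d_array[N * N];

    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    bool inside = (row < N && col < N);

    // Shared memory is indexed with the thread's position inside the block.
    int tile = threadIdx.y * N + threadIdx.x;

    if (inside) {
        d_array[tile] = matrix[row * N + col];
    }
    __syncthreads();  // every thread in the block reaches this barrier

    if (inside) {
        out[row * N + col] = d_array[tile];
    }
}

// Hypothetical host helper: runs the kernel once with a single 4 x 4 block.
// Returns true if all CUDA calls succeeded.
bool runCopy(const float *h_in, float *h_out)
{
    float *d_in = nullptr;
    float *d_out = nullptr;
    size_t bytes = N * N * sizeof(float);

    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaError_t err = cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(N, N);  // 16 threads, one per matrix element
        copyThroughShared<<<1, block>>>(d_in, d_out);
        // This blocking copy also reports errors from the kernel launch.
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}

int main()
{
    float h_in[N * N], h_out[N * N];
    for (int i = 0; i < N * N; ++i) h_in[i] = (float)i;

    if (!runCopy(h_in, h_out)) {
        printf("CUDA error\n");
        return 1;
    }
    for (int r = 0; r < N; ++r) {
        for (int c = 0; c < N; ++c) printf("%5.1f ", h_out[r * N + c]);
        printf("\n");
    }
    return 0;
}
```

In this sketch I index the shared array with threadIdx rather than the global row and col, on the assumption that each block only ever holds its own tile.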
Many many thanks!