Shared memory problem


I have a float * A of width=wA, pitch=pA and height=hA.

I am trying to copy parts of A into a shared memory array shared_A.

In the example below, I try to store the tile [lig_A, lig_A+BLOCK_SIZE) x [col_A, col_A+BLOCK_SIZE), where l is the row index within the tile and tx is threadIdx.x.

I use thread blocks of size (BLOCK_SIZE,1,1).

What I don’t understand is that the following makes my GeForce crash:

tmp = A[(lig_A+l)*pA + col_A+tx];

shared_A[tx][l] = tmp;

But when I use the following lines instead, it works…

tmp = A[(lig_A+l)*pA + col_A+tx];

shared_A[tx][l] = 1.0;//Just an example

I don’t understand why I can store 1.0 in shared memory but not tmp, which is a float too…

It is driving me crazy!

Maybe I am doing something wrong, but I can’t see what…

Thanks for your help!


Do you have a __syncthreads() after the shared memory write?
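For anyone reading this later, here is a minimal sketch of that kind of tile load with the barrier in place. It is not the original poster's code: the kernel name load_tile, the out buffer and the host helper run_tile_copy_check are made up for illustration, and it assumes pA is a pitch counted in floats (cudaMallocPitch returns bytes, so the host divides by sizeof(float)). The bounds checks are also an assumption, added so that a tile overhanging the edge of A never reads out of range.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define BLOCK_SIZE 16

// Hypothetical kernel: one block of (BLOCK_SIZE,1,1) threads copies the tile
// starting at (lig_A, col_A) into shared memory, then writes it to out.
// pA is assumed to be the pitch of A in elements (floats), not in bytes.
__global__ void load_tile(const float *A, int wA, int hA, int pA,
                          int lig_A, int col_A, float *out)
{
    __shared__ float shared_A[BLOCK_SIZE][BLOCK_SIZE];
    const int tx = threadIdx.x;

    // Thread tx loads column col_A+tx of the tile, one row per iteration.
    for (int l = 0; l < BLOCK_SIZE; ++l) {
        const int row = lig_A + l;
        const int col = col_A + tx;
        float tmp = 0.0f;                 // padding value outside A
        if (row < hA && col < wA)         // assumed guard against out-of-range reads
            tmp = A[row * pA + col];
        shared_A[tx][l] = tmp;
    }

    // Barrier: every thread must finish writing before any thread reads
    // entries that were written by other threads.
    __syncthreads();

    // Thread tx writes tile row tx; shared_A[c][tx] was stored by thread c.
    for (int c = 0; c < BLOCK_SIZE; ++c)
        out[tx * BLOCK_SIZE + c] = shared_A[c][tx];
}

// Hypothetical host-side check: returns true if the tile read back from the
// GPU matches A, with zeros where the tile overhangs the matrix.
bool run_tile_copy_check()
{
    const int wA = 40, hA = 24, lig_A = 16, col_A = 32; // tile overhangs both edges
    std::vector<float> hostA(wA * hA);
    for (int r = 0; r < hA; ++r)
        for (int c = 0; c < wA; ++c)
            hostA[r * wA + c] = 100.0f * r + c;

    float *dA = nullptr, *dOut = nullptr;
    size_t pitchBytes = 0;
    bool ok = cudaMallocPitch((void **)&dA, &pitchBytes,
                              wA * sizeof(float), hA) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&dOut,
                          BLOCK_SIZE * BLOCK_SIZE * sizeof(float)) == cudaSuccess;
    ok = ok && cudaMemcpy2D(dA, pitchBytes, hostA.data(), wA * sizeof(float),
                            wA * sizeof(float), hA,
                            cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int pA = (int)(pitchBytes / sizeof(float)); // bytes -> elements
        load_tile<<<1, BLOCK_SIZE>>>(dA, wA, hA, pA, lig_A, col_A, dOut);
        ok = cudaGetLastError() == cudaSuccess;
    }

    std::vector<float> tile(BLOCK_SIZE * BLOCK_SIZE, -1.0f);
    ok = ok && cudaMemcpy(tile.data(), dOut, tile.size() * sizeof(float),
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    for (int r = 0; ok && r < BLOCK_SIZE; ++r)
        for (int c = 0; c < BLOCK_SIZE; ++c) {
            const int row = lig_A + r, col = col_A + c;
            const float expected =
                (row < hA && col < wA) ? hostA[row * wA + col] : 0.0f;
            ok = ok && tile[r * BLOCK_SIZE + c] == expected;
        }

    cudaFree(dA);
    cudaFree(dOut);
    return ok;
}
```

Calling run_tile_copy_check() from main() should return true on a machine with a working CUDA device. If you remove the __syncthreads(), the second loop may read shared_A entries that other threads have not written yet.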

Problem fixed. It was only a little bug…
My bad :-)