Why does this not work?

Hi,
I tried to write code to do an AND operation on two binarized images (the images contain only 0 and 1).
What I did was:

unsigned int xIndex;
unsigned int yIndex;
unsigned int index_in;
xIndex = blockIdx.x * BLOCK_SIZEX + threadIdx.x;
yIndex = blockIdx.y * BLOCK_SIZEY + threadIdx.y;
index_in = yIndex * width + xIndex;
__shared__ float share1[BLOCK_SIZEX][BLOCK_SIZEY];
__shared__ float share2[BLOCK_SIZEX][BLOCK_SIZEY];
share1[threadIdx.x][threadIdx.y] = *(Source1+index_in);
share2[threadIdx.x][threadIdx.y] = *(Source2+index_in);
__syncthreads();
if(share1[threadIdx.x][threadIdx.y]>0 && share2[threadIdx.x][threadIdx.y]>0){
    *(result+index_in)=1;
}

But it does not work and returns a CUDA memory I/O error (cudaError).

But when I changed it to:

if((share1[threadIdx.x][threadIdx.y]+share2[threadIdx.x][threadIdx.y])==2){
    *(result+index_in)=1;
}

Then it worked. Why?
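
For comparison, the result I expect from either version is what the following host-side loop would produce. This is only a minimal sketch for checking the GPU output: the function name andReference and the std::vector interface are hypothetical and not part of my actual code, and it assumes both images are float arrays of equal size holding only 0 and 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for the element-wise AND of two binarized images.
// Assumption: both images are float arrays of the same size that hold
// only 0 and 1. The name andReference is hypothetical, for illustration.
std::vector<float> andReference(const std::vector<float>& source1,
                                const std::vector<float>& source2)
{
    // Both inputs must describe images of the same size.
    assert(source1.size() == source2.size());

    // Every output pixel starts at 0, so pixels that fail the test are defined.
    std::vector<float> result(source1.size(), 0.0f);

    for (std::size_t i = 0; i < source1.size(); ++i) {
        // A pixel is set only where both inputs are set.
        if (source1[i] > 0.0f && source2[i] > 0.0f) {
            result[i] = 1.0f;
        }
    }
    return result;
}
```

Unlike the kernel fragments above, this reference sets every output pixel, including the ones where the condition is false.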