Inconsistent Values within Kernel


I am comparing two values within a kernel and writing a 1 to a matrix if they are equal and a 0 if not. However, when I read a location in the matrix that should hold a 1, I find a 0. If I read the value immediately after it is written inside the loop, I get the correct value. But if I read it after the loop is done, the result is incorrect.

__global__ void
propagateGPU1(clause** clauses, vecp* dclauses, int* assign, short* dmatrix, size_t pitch, int* testv, short* stestv)
{
    int i;
    int j;
    const unsigned int tid = threadIdx.x;

    for(i = 0; i < dclauses->size; i++){
        short* row = (short*)((char*)dmatrix + i * pitch);
        for(j = 0; j < dclause_size(&clauses[0][i*64]); j++){
            if(assign[tid] == (clauses[0][i*64].lits[j])){
                row[tid] = 1;
                //if(tid == 15 && i == 0 && j == 0) *stestv = dmatrix[15];   <--- when this is enabled, it produces a 1
            }
            else{
                row[tid] = 0;
                //if(tid == 15 && i == 0 && j == 0) *stestv = dmatrix[15];
            }
        }
    }

    if(tid == 15) *stestv = dmatrix[15];   // <--- this gives a 0
}


I know for a fact that the result should be 1: I have checked within the kernel that the inputs are correct, and assign[15] and clauses[0][0].lits[0] are equal. I am not doing anything between the loop and reading the value from the matrix, but for some reason the result has changed.

The pitch is 128, and the problem occurs at i = 0 and j = 0.

Could anyone please help me understand this problem? Your help is very much appreciated…


How do you launch the kernel? Is there only one block?
If not, row[tid] is written by more than one thread at once: threadIdx.x is relative to the thread's block and restarts at zero in every new block.
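
To illustrate the point about per-block thread indices, here is a minimal sketch, assuming a one-dimensional launch, of how a kernel can derive a grid-wide index so that threads in different blocks do not write the same element. The names `globalIndex`, `markColumns` and `numColumns` are hypothetical and are not part of the posted code.

```cuda
// Hypothetical helper: combines block and thread coordinates into one
// grid-wide index for a 1D launch.
__host__ __device__ inline unsigned int globalIndex(unsigned int blockId,
                                                    unsigned int blockSize,
                                                    unsigned int threadId)
{
    return blockId * blockSize + threadId;
}

// Hypothetical kernel: each thread writes only its own column of one row.
__global__ void markColumns(short* row, unsigned int numColumns)
{
    // threadIdx.x alone restarts at 0 in every block, so it is combined
    // with blockIdx.x and blockDim.x to get a unique column.
    const unsigned int col = globalIndex(blockIdx.x, blockDim.x, threadIdx.x);

    // Guard against the last block running past the end of the row.
    if (col < numColumns) {
        row[col] = 1;
    }
}
```

With a single block this reduces to threadIdx.x, so the same kernel should behave identically before and after more blocks are added.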

Thank you for your reply; I do appreciate your help. At the current stage I only have one block, but you did alert me to this issue for when I do have more blocks.

Fortunately, I figured out the problem just as I put my head on the pillow last night. The problem was that the next iteration of j would cause the thread to write a 0. A stupid mistake, but the moral of the story: don’t code or debug when you are tired and it’s 5 in the morning.
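
For anyone hitting the same symptom, here is a minimal sketch of one way to avoid the overwrite: compute the result for the whole clause first and write the matrix element exactly once. Since the definitions of `clause` and `dclause_size` are not shown in the thread, the sketch assumes a flat `lits` array with `stride` slots per clause and a `sizes` array holding each clause's literal count. `containsLiteral`, `markMatches` and these parameters are hypothetical names.

```cuda
#include <cstddef>

// Hypothetical helper: returns 1 if `value` appears among the first
// `count` literals, 0 otherwise. It stops at the first match, so a later
// non-matching literal cannot undo the result.
__host__ __device__ inline short containsLiteral(const int* lits, int count, int value)
{
    for (int j = 0; j < count; j++) {
        if (lits[j] == value) {
            return 1;
        }
    }
    return 0;
}

// Hypothetical single-block kernel: one matrix row per clause, one column
// per thread. Assumes clause i occupies lits[i * stride .. i * stride + sizes[i] - 1].
__global__ void markMatches(const int* lits, const int* sizes, int numClauses, int stride,
                            const int* assign, short* dmatrix, size_t pitch)
{
    const unsigned int tid = threadIdx.x;

    for (int i = 0; i < numClauses; i++) {
        // Same pitched-row addressing as in the original kernel.
        short* row = (short*)((char*)dmatrix + i * pitch);

        // Written exactly once per (row, thread) pair.
        row[tid] = containsLiteral(lits + i * stride, sizes[i], assign[tid]);
    }
}
```

An alternative with the same effect would be to set row[tid] to 0 before the inner loop and only ever assign 1 inside it.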

I’ll keep your suggestion in mind, as I might well have made that mistake without your input once I start using more blocks. Thanks again…