Set global memory inside conditional statement?

Hello people, I am having a problem with a very simple string comparison kernel. I experience a crash on the line “g_odata[tid]=1;”. If the if statement is removed so that all threads execute the “g_odata[tid]=1;” line, then the kernel runs happily. The kernel also works without error in Emu mode. I feel like I must have missed something very fundamental. Can someone please explain this error to me?

#########################

[codebox]__global__ void
testKernel(int* g_odata, unsigned int numSignatures, int candidateLength, int sigLength, char* candidate, char* sigs)
{
    // access thread id
    unsigned int tid = (blockIdx.x * blockDim.x) + threadIdx.x;

    if (tid >= numSignatures) // better safe than sorry
        return;

    // try every start offset i in the candidate (brute force)
    for (int i = 0; i < candidateLength; i++) {
        int j = 0;

        // check the bounds before reading candidate[i+j]
        while (j < sigLength && j < candidateLength - i && candidate[i+j] == sigs[(tid*sigLength)+j]) {
            j++;
        }

        // all sigLength characters matched: flag this signature
        if (j == sigLength) {
            g_odata[tid] = 1;
            break;
        }
    }
}[/codebox]

#########################

Cheers,

Alex

That looks pretty innocuous to me. Are you certain that the dimensions of g_odata cannot permit the store to global memory to go out of bounds?

Yes, I am sure there is enough memory allocated, as the problem is not present when all threads perform the write. As a test I changed the line to “g_odata[0]=1;” and the error still occurs. I only encounter a kernel crash when some threads take the path that leads to the memory being set and some do not. That is to say, if all threads find a match there is no error, and if no threads find a match there is no error.

Thanks for the help :)

“i” is increasing only by 1 for every iteration. Is that ok? (I thought it was like a base-index for a signature)

Got it! No idea why on earth it manifested with that behaviour, but the bug was unrelated. Without noticing, I had been passing the host candidate pointer, not the device candidate pointer. I am still interested in how this caused the observed behaviour, and if anyone knows, that would be great to learn.

Moral of the story: if it works in Emu mode but crashes on the device, you probably passed the wrong pointer.
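For anyone who lands here with the same symptom, this is roughly what the host side could look like with the device copies made explicit. It is only a sketch: the wrapper name `runMatch`, the `CUDA_CHECK` macro, the `h_`/`d_` prefixes and the block size of 128 are my own assumptions, not the code from the original project. The kernel is the brute-force one from the first post, with the bounds check placed before the read.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Brute-force search: thread tid looks for signature tid in the candidate.
__global__ void testKernel(int* g_odata, unsigned int numSignatures,
                           int candidateLength, int sigLength,
                           const char* candidate, const char* sigs)
{
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= numSignatures) return;

    for (int i = 0; i < candidateLength; i++) {
        int j = 0;
        // Bounds are checked before candidate[i + j] is read.
        while (j < sigLength && j < candidateLength - i &&
               candidate[i + j] == sigs[tid * sigLength + j]) {
            j++;
        }
        if (j == sigLength) { g_odata[tid] = 1; break; }
    }
}

// Hypothetical host wrapper. h_* lives in host memory, d_* in device memory.
// Assumes h_sigs holds at least one signature of length sigLength.
std::vector<int> runMatch(const std::vector<char>& h_candidate,
                          const std::vector<char>& h_sigs, int sigLength)
{
    unsigned int numSignatures = (unsigned int)(h_sigs.size() / sigLength);
    int candidateLength = (int)h_candidate.size();
    size_t outBytes = numSignatures * sizeof(int);

    char* d_candidate = nullptr;
    char* d_sigs = nullptr;
    int* d_odata = nullptr;
    CUDA_CHECK(cudaMalloc(&d_candidate, h_candidate.size()));
    CUDA_CHECK(cudaMalloc(&d_sigs, h_sigs.size()));
    CUDA_CHECK(cudaMalloc(&d_odata, outBytes));

    CUDA_CHECK(cudaMemcpy(d_candidate, h_candidate.data(), h_candidate.size(),
                          cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_sigs, h_sigs.data(), h_sigs.size(),
                          cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemset(d_odata, 0, outBytes));

    const unsigned int threads = 128;
    const unsigned int blocks = (numSignatures + threads - 1) / threads;
    // The kernel gets d_candidate and d_sigs, never h_candidate.data().
    testKernel<<<blocks, threads>>>(d_odata, numSignatures, candidateLength,
                                    sigLength, d_candidate, d_sigs);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while running

    std::vector<int> h_odata(numSignatures);
    CUDA_CHECK(cudaMemcpy(h_odata.data(), d_odata, outBytes,
                          cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(d_candidate));
    CUDA_CHECK(cudaFree(d_sigs));
    CUDA_CHECK(cudaFree(d_odata));
    return h_odata;
}
```

Calling `runMatch` with the candidate "abcde" and the signatures "bc", "xy", "de" (so `sigLength` is 2) should flag the first and third signature. Keeping a naming convention like `h_`/`d_` is cheap and makes it much harder to hand a host pointer to a kernel by accident.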

That would depend on the matching algorithm. In this case I simplified it to brute force to eliminate possible points of error.

Without the if statement, the store no longer depends on the result of the loop, so the compiler could probably see that the whole loop (including the access to the candidate pointer) didn’t do anything, and so optimised it away, eliminating the candidate read that was causing the problem.
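A stripped-down sketch of the two cases, in case it helps to see them side by side. The kernel names and the `runVariants` driver are made up for illustration, and error checking is left out to keep it short. Whether a particular nvcc version really drops the loop in the second kernel is something you would have to confirm yourself, for example by looking at the output of `nvcc -ptx`.

```cuda
#include <utility>
#include <vector>
#include <cuda_runtime.h>

// Variant A: the loop result decides whether the store happens,
// so the loads from candidate have to stay in the compiled kernel.
__global__ void conditionalStore(int* out, const char* candidate, int n, char key)
{
    bool found = false;
    for (int i = 0; i < n; i++) {
        if (candidate[i] == key) { found = true; break; }
    }
    if (found) out[0] = 1;
}

// Variant B: the store is unconditional and nothing uses the loop result,
// so the compiler is free to remove the loop together with its loads.
__global__ void unconditionalStore(int* out, const char* candidate, int n, char key)
{
    bool found = false;
    for (int i = 0; i < n; i++) {
        if (candidate[i] == key) { found = true; break; }
    }
    (void)found;  // result deliberately unused
    out[0] = 1;
}

// Hypothetical driver: runs both variants on valid device memory and
// returns {output of A, output of B}. Assumes a non-empty candidate.
std::pair<int, int> runVariants(const std::vector<char>& h_candidate, char key)
{
    int n = (int)h_candidate.size();
    char* d_candidate = nullptr;
    int* d_out = nullptr;  // d_out[0] is written by A, d_out[1] by B
    cudaMalloc(&d_candidate, h_candidate.size());
    cudaMalloc(&d_out, 2 * sizeof(int));
    cudaMemcpy(d_candidate, h_candidate.data(), h_candidate.size(),
               cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, 2 * sizeof(int));

    conditionalStore<<<1, 1>>>(d_out, d_candidate, n, key);
    unconditionalStore<<<1, 1>>>(d_out + 1, d_candidate, n, key);

    int h_out[2] = {0, 0};
    // This blocking copy runs after both kernels on the default stream.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    cudaFree(d_candidate);
    cudaFree(d_out);
    return {h_out[0], h_out[1]};
}
```

With valid device memory both kernels behave as written. The difference only shows up with a bad `candidate` pointer: variant A has to dereference it, while variant B may never touch it once the loop is gone, which would match the crash disappearing when the if statement was removed.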

Indeed! I never expected the compiler to be quite that good at optimizing.