This should be a simple issue, but it’s got me stumped. I have an array of structs and a set of inputs I’m trying to match. Each thread in the block looks at its assigned struct elements to find a match. The thread that finds it, and only that thread, sets the shared index variable. Then all threads meet at a __syncthreads(), after which every thread should know the index via the shared variable. But it’s not working all the time: some threads in the block are executing code after the __syncthreads() while other threads in the block haven’t reached it yet.
The code is here:
__shared__ int index;
int i = threadID;
index = -1;
__syncthreads();
while (i < d_numItems) {
    int strmatch = 0;
    char const *s1 = label; //reset s1 and s2 back to the start
    char const *s2 = &(d_varDB[i].label[0]);
    //a one-line strcmp. This should keep branching down to a minimum.
    while (!(strmatch = *(unsigned char *) s1 - *(unsigned char *) s2) && *s1++ && *s2++);
    //One and only one entry in d_varDB has these matching values.
    if (strmatch == 0 && d_varDB[i].domainID == patchID && d_varDB[i].matlIndex == matlIndex) {
        index = i; //This thread found it.
    }
    i = i + numThreads; //Try the next section.
}
//sync before return;
__syncthreads();
if (index == -1) {
    printf("Error, didn't find anything for %s patch %d matl %d with threadID %d and numthreads %d\n", label, patchID, matlIndex, threadID, numThreads);
    return NULL;
}
else return &d_varDB[index];
Some threads in the block hit the if (index == -1) check and enter the error-handling code, while other threads in the block, which should have found the index, haven’t got that far yet.
I’ve always read that you shouldn’t put __syncthreads() inside conditions, but that you can put conditions between __syncthreads() calls. Am I mistaken on this? Or is it something stupid and obvious that I just can’t see? :)
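For reference, a minimal self-contained sketch of the pattern described above: conditional work placed between barriers that every thread of the block reaches. The names `Item`, `sameLabel`, `findIndex`, `lookupKernel`, and `runDemo` are hypothetical stand-ins, not the real code, and the data is made up. The third `__syncthreads()` rests on an assumption: that the lookup may be called more than once per kernel, in which case a fast thread could otherwise reset the shared slot for the next call before a slower thread has read the previous result.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the d_varDB entries in the post.
struct Item {
    char label[16];
    int domainID;
    int matlIndex;
};

// Device-side string equality (same idea as the one-line strcmp in the post).
__device__ bool sameLabel(const char* a, const char* b) {
    while (*a && *a == *b) { ++a; ++b; }
    return *a == *b;
}

// Must be called by every thread of the block; all threads get the same result.
__device__ int findIndex(const Item* items, int numItems, const Item& key) {
    __shared__ int index;
    if (threadIdx.x == 0) index = -1;  // a single thread resets the shared slot
    __syncthreads();                   // reset is visible before the search starts

    // Conditions sit between barriers; no barrier is inside a branch.
    for (int i = threadIdx.x; i < numItems; i += blockDim.x) {
        if (items[i].domainID == key.domainID &&
            items[i].matlIndex == key.matlIndex &&
            sameLabel(key.label, items[i].label)) {
            index = i;                 // at most one entry matches in this demo
        }
    }
    __syncthreads();                   // all writes are done before anyone reads

    int result = index;                // private copy of the shared value
    // Assumption: the function may be called again by the same block.
    // This barrier keeps the next call's reset from racing with the read above.
    __syncthreads();
    return result;
}

// Calls the lookup repeatedly and counts every wrong answer any thread sees.
__global__ void lookupKernel(const Item* items, int numItems, Item key,
                             int expected, int rounds, int* mismatches) {
    for (int r = 0; r < rounds; ++r) {
        int found = findIndex(items, numItems, key);
        if (found != expected) atomicAdd(mismatches, 1);
    }
}

// Host helper: returns the number of wrong answers, or -1 on a CUDA error.
int runDemo() {
    const int numItems = 1000, target = 777, threads = 128, rounds = 50;
    Item* items = nullptr;
    int* mismatches = nullptr;
    if (cudaMallocManaged(&items, numItems * sizeof(Item)) != cudaSuccess) return -1;
    if (cudaMallocManaged(&mismatches, sizeof(int)) != cudaSuccess) return -1;

    for (int i = 0; i < numItems; ++i) {
        snprintf(items[i].label, sizeof(items[i].label), "var%d", i);  // unique labels
        items[i].domainID = i % 7;
        items[i].matlIndex = i % 3;
    }
    *mismatches = 0;
    Item key = items[target];          // search for a known entry

    lookupKernel<<<1, threads>>>(items, numItems, key, target, rounds, mismatches);
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;

    int bad = *mismatches;
    cudaFree(items);
    cudaFree(mismatches);
    return bad;
}
```

Calling `runDemo()` from a host `main` and checking that it returns 0 exercises the lookup 50 times in one block. Copying the shared value into a local variable before the final barrier is a design choice of this sketch: after that barrier, no thread depends on the shared slot any more, so a later call is free to overwrite it.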