Why doesn't this code section work? Four nested loops

Below is a global function. When this code section is not commented out, the whole kernel seems not to work at all, including the code before it. All threads seem to do nothing before exiting. It is quite strange.

The following code performs a binary search N times using one block.

[codebox]
global function
{
    … … former… …

    for (i = 1; i < N; i++)
    {
        for (eleIndex = threadIdx.x; eleIndex < size; eleIndex += blockDim.x)
        {
            left = 0; right = long_size - 1;
            while (left <= right)
            {
                middle = (left + right) / 2;
                q = somevalue;
                if (p == q)
                {
                    d_global_array[eleIndex] = 1;
                    break;
                }
                if (p > q) left = middle + 1;
                else right = middle - 1;
            }
        }
    }

    … …later … …
}
[/codebox]

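You know you're gonna get a horrible race condition if you launch more than one block of this, right? eleIndex depends only on threadIdx.x, so every block would write to the same elements of d_global_array.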
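For what it's worth, here is a rough sketch of one way to make the inner search safe across several blocks: a grid-stride loop, so each eleIndex is handled by exactly one thread in the whole grid. Everything the post elides is an assumption here: the kernel name `mark_found`, the `int` types, `d_queries` as the source of `p`, and `d_sorted` as the source of `q` are all hypothetical, and the outer `i` loop is left out.

[codebox]
#include <cuda_runtime.h>

// Hypothetical sketch: names, types and data sources are assumptions,
// not the original poster's code.
__global__ void mark_found(const int *d_queries, int size,
                           const int *d_sorted, int long_size,
                           int *d_global_array)
{
    // Index and stride span the whole grid, so no two threads
    // (in any block) ever process the same eleIndex.
    int stride = blockDim.x * gridDim.x;
    for (int eleIndex = blockIdx.x * blockDim.x + threadIdx.x;
         eleIndex < size; eleIndex += stride)
    {
        int p = d_queries[eleIndex];          // assumed source of p
        int left = 0, right = long_size - 1;
        while (left <= right)
        {
            int middle = left + (right - left) / 2;  // avoids overflow of left + right
            int q = d_sorted[middle];         // assumed source of q (sorted ascending)
            if (p == q)
            {
                d_global_array[eleIndex] = 1;
                break;
            }
            if (p > q) left = middle + 1;
            else right = middle - 1;
        }
    }
}
[/codebox]

With a single block this behaves like the original threadIdx.x loop; with more blocks the elements are simply split among them.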

Also, if you're going to launch just one block, I can almost guarantee you'd be much better off doing a naive linear search where each thread tries one element and, if it's the one we're searching for, writes its index somewhere.
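A minimal sketch of that naive linear search, assuming `int` data and a single key; the names `linear_search`, `find_index`, `d_data` and `d_result` are made up for illustration. Threads stride over the array so it also works when there are more elements than threads, and error checking is omitted for brevity.

[codebox]
#include <cuda_runtime.h>

// Hypothetical sketch: each thread compares its elements against key and,
// on a match, writes the element's index to *d_result.
__global__ void linear_search(const int *d_data, int n, int key, int *d_result)
{
    for (int i = threadIdx.x; i < n; i += blockDim.x)
    {
        if (d_data[i] == key)
        {
            // Assumes at most one match; with duplicates, one of the
            // matching indices ends up in *d_result (which one is unspecified).
            *d_result = i;
        }
    }
}

// Host helper: returns the index of key in h_data, or -1 if it is absent.
int find_index(const int *h_data, int n, int key)
{
    int *d_data = nullptr, *d_result = nullptr;
    int h_result = -1;                       // "not found" marker
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMalloc(&d_result, sizeof(int));
    cudaMemcpy(d_data, h_data, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_result, &h_result, sizeof(int), cudaMemcpyHostToDevice);

    linear_search<<<1, 256>>>(d_data, n, key, d_result);   // one block only

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_result);
    return h_result;
}
[/codebox]

If the first match matters when duplicates are possible, replacing the plain store with `atomicMin(d_result, i)` (and initialising the result to a large value instead of -1) would be one option.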