Dynamically Canceling a Block: Deadlocks (kernel randomly freezes)

I have a kernel that takes an int array as input, with one entry per thread block. If a block's entry in this array is 0, I want to cancel that entire thread block. Generally, about 70% of the blocks can be cancelled (saving a huge amount of time).

__global__ void SampleKernel(int* instructions, ----- other stuff----)
{
    const unsigned int bx = blockIdx.x;

    // bx is the same for every thread in the block, so the whole block takes this branch together
    if (instructions[bx] == 0)
    {
        return;
    }

    *do lots of work otherwise*
    __syncthreads();
    *tree reduction*
    __syncthreads();
    *if threadIdx.x==0, write to global memory*
}

The problem is that this kernel will occasionally hang indefinitely. Sometimes it freezes on the 80th iteration, sometimes on the 5th.

I can't pin down the cause, though. Any form of branching on instructions[bx] seems to produce this behavior.

For example,

__global__ void SampleKernel(int* instructions, ----- other stuff----)
{
    const unsigned int bx = blockIdx.x;

    // Skip only the expensive part; every thread still reaches both barriers below
    if (instructions[bx] != 0)
    {
        *do lots of work*
    }

    __syncthreads();
    *tree reduction*
    __syncthreads();
    *if threadIdx.x==0, write to global memory*
}

This will also occasionally freeze.

Any ideas why? Perhaps there’s a better way to cancel an entire thread block?

And yes, I could generate a more complicated list of instructions to get around the block cancelling, but that takes significantly more computation time.
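For anyone who wants something concrete to compile, here is a minimal, self-contained sketch of the first pattern: a block-level early return followed by a shared-memory tree reduction. It is not my real kernel. The per-thread "work" (each block just sums its slice of an array), the names data, results and BLOCK_SIZE, and the host harness are all placeholders made up for illustration. It is also written against the current runtime API (for example cudaDeviceSynchronize), not CUDA 1.1, so it would need adjusting for an old toolkit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 128  // hypothetical block size; must be a power of two for the reduction

// Placeholder kernel: each active block sums its BLOCK_SIZE-element slice of `data`.
__global__ void SampleKernel(const int* instructions, const float* data, float* results)
{
    __shared__ float partial[BLOCK_SIZE];
    const unsigned int bx = blockIdx.x;
    const unsigned int tx = threadIdx.x;

    // blockIdx.x is identical for every thread in a block, so this branch is
    // block-uniform: either all threads of the block return here or none do.
    if (instructions[bx] == 0)
        return;

    // Stand-in for "do lots of work".
    partial[tx] = data[bx * BLOCK_SIZE + tx];
    __syncthreads();

    // Tree reduction. The barrier sits outside the `if (tx < stride)` test,
    // so every thread of the block reaches it on every iteration.
    for (unsigned int stride = BLOCK_SIZE / 2; stride > 0; stride >>= 1)
    {
        if (tx < stride)
            partial[tx] += partial[tx + stride];
        __syncthreads();
    }

    if (tx == 0)
        results[bx] = partial[0];
}

int main()
{
    const int numBlocks = 8;
    const int n = numBlocks * BLOCK_SIZE;
    int   hInstr[numBlocks];
    float hData[n];
    float hResults[numBlocks];

    for (int i = 0; i < n; ++i) hData[i] = 1.0f;
    for (int b = 0; b < numBlocks; ++b)
    {
        hInstr[b]   = (b % 3 == 0) ? 1 : 0;  // cancel roughly two thirds of the blocks
        hResults[b] = -1.0f;                 // sentinel: stays -1 if the block was cancelled
    }

    int* dInstr = 0; float* dData = 0; float* dResults = 0;
    cudaMalloc((void**)&dInstr,   sizeof(hInstr));
    cudaMalloc((void**)&dData,    sizeof(hData));
    cudaMalloc((void**)&dResults, sizeof(hResults));
    cudaMemcpy(dInstr,   hInstr,   sizeof(hInstr),   cudaMemcpyHostToDevice);
    cudaMemcpy(dData,    hData,    sizeof(hData),    cudaMemcpyHostToDevice);
    cudaMemcpy(dResults, hResults, sizeof(hResults), cudaMemcpyHostToDevice);

    SampleKernel<<<numBlocks, BLOCK_SIZE>>>(dInstr, dData, dResults);
    cudaError_t err = cudaDeviceSynchronize();  // blocks until the kernel finishes (or hangs)
    if (err != cudaSuccess)
    {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(hResults, dResults, sizeof(hResults), cudaMemcpyDeviceToHost);

    // Active blocks should hold the slice sum, cancelled blocks the sentinel.
    int mismatches = 0;
    for (int b = 0; b < numBlocks; ++b)
    {
        const float expected = hInstr[b] ? (float)BLOCK_SIZE : -1.0f;
        if (hResults[b] != expected) ++mismatches;
        printf("block %d: instruction=%d result=%.1f\n", b, hInstr[b], hResults[b]);
    }

    cudaFree(dInstr); cudaFree(dData); cudaFree(dResults);
    return mismatches == 0 ? 0 : 1;
}
```

Running the launch in a loop with different instruction patterns is how I would try to reproduce the intermittent freeze; I can't say whether this stripped-down version triggers it the way the full kernel does.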

(Using a Tesla card on an EVGA NVIDIA 780i SLi motherboard, running CUDA 1.1 on Fedora Core 8.)

From the 1.1 Programming guide:

This is clearly true for my application…thus my concern.