kernel function exits after __syncthreads


This is my first time posting to this forum, so be gentle :)

GPU: 8800 GTX.

I am writing a kernel function to perform wavelet transformation via lifting. I have to use multiple __syncthreads() (one after each lifting step).

When I run it in emulation mode under gdb, I find that the code up to the first __syncthreads() gets executed. After that, the program exits the kernel function without executing the remaining lines of code (including the additional __syncthreads() statements). I have verified this in the debugger as well as by inserting some printf statements after the first __syncthreads() statement.

The __syncthreads() statement is NOT within any kind of conditional loop, although there are conditional statements preceding __syncthreads().

Any help will be greatly appreciated!



I think this has to do with a conditional __syncthreads(), although you say that is not the case. Please scrutinize the code carefully.

Otherwise, reduce the code to the bare minimum that still reproduces the problem. Then either scrutinize that to find the problem, or post the minimal code here and we can check.
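For reference, here is a minimal sketch of what "conditional __syncthreads()" means and how to avoid it. The original kernel was not posted, so everything here is hypothetical: the kernel name `reverse_in_block`, the `BLOCK` size, and the host helper `run_reverse` are made up for illustration, and the host code assumes a current CUDA toolkit (`cudaDeviceSynchronize`).

```cuda
#include <cuda_runtime.h>

#define BLOCK 256  // hypothetical block size for this sketch

// Hazard (do NOT do this): only some threads reach the barrier, so the
// behavior is undefined and the kernel may hang or appear to exit early.
//
//   if (threadIdx.x < n) {
//       tile[threadIdx.x] = in[threadIdx.x];
//       __syncthreads();
//   }

// Safe pattern: the branch guards only the memory accesses, and every
// thread in the block reaches the same __syncthreads().
__global__ void reverse_in_block(const float *in, float *out, int n)
{
    __shared__ float tile[BLOCK];
    int t = threadIdx.x;

    if (t < n) tile[t] = in[t];
    __syncthreads();                       // reached by all threads

    if (t < n) out[t] = tile[n - 1 - t];   // index stays within [0, n-1]
}

// Host helper: reverses n values (1 <= n <= BLOCK). Returns 0 on success.
int run_reverse(const float *h_in, float *h_out, int n)
{
    if (n < 1 || n > BLOCK) return -1;
    size_t bytes = n * sizeof(float);
    float *d_in = NULL, *d_out = NULL;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaError_t err = cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        reverse_in_block<<<1, BLOCK>>>(d_in, d_out, n);
        err = cudaGetLastError();          // catches launch errors
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : -1;
}
```

Calling `run_reverse(in, out, 4)` from `main` with `in = {1, 2, 3, 4}` should leave `{4, 3, 2, 1}` in `out`. Note that conditionals before the barrier are fine as long as no thread can skip or return past the barrier itself.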

A word of caution about checking with printf(): we did that too, and after a while we discovered that with printf statements in place, the kernel can stall at a different point in the kernel than you think. This is because writing to the console is much slower than computing.

So be careful when using printf to check whether you are past some point.
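One possible alternative to printf, sketched under assumptions: have each thread record the last stage it completed in a global buffer and inspect that on the host afterwards. The names `staged_kernel`, `count_finished_threads`, and `THREADS` are hypothetical. This only helps when the kernel runs to completion; if the kernel faults, the copy back to the host fails as well.

```cuda
#include <cuda_runtime.h>

#define THREADS 64  // hypothetical block size for this sketch

// Each thread writes the number of the last stage it finished to stage[t].
__global__ void staged_kernel(int *stage)
{
    __shared__ int tile[THREADS];
    int t = threadIdx.x;

    tile[t] = t;
    stage[t] = 1;                              // stage 1 done
    __syncthreads();

    int neighbor = tile[(t + 1) % THREADS];    // safe: read after the barrier
    stage[t] = 2;                              // stage 2 done
    __syncthreads();

    tile[t] += neighbor;
    stage[t] = 3;                              // stage 3 done
}

// Host helper: returns how many threads reached stage 3, or -1 on error.
int count_finished_threads(void)
{
    int h_stage[THREADS];
    int *d_stage = NULL;
    size_t bytes = THREADS * sizeof(int);
    if (cudaMalloc(&d_stage, bytes) != cudaSuccess) return -1;

    cudaError_t err = cudaMemset(d_stage, 0, bytes);
    if (err == cudaSuccess) {
        staged_kernel<<<1, THREADS>>>(d_stage);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(h_stage, d_stage, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_stage);
    if (err != cudaSuccess) return -1;

    int finished = 0;
    for (int i = 0; i < THREADS; ++i)
        if (h_stage[i] == 3) ++finished;
    return finished;
}
```

If every thread gets past both barriers, `count_finished_threads()` returns `THREADS`; a smaller count points at the stage where some threads stopped.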

And welcome!

Thanks to all for helpful replies.

I figured out what the problem was: I was accessing an out-of-bounds array location in shared memory. This error manifested itself as an exit after __syncthreads(). I don't know exactly why, but fixing the out-of-bounds access fixed the __syncthreads() problem.
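For anyone who lands here with the same symptom, below is a rough sketch of the kind of guard that avoids this. It is not my actual kernel: it assumes a 5/3-style predict step, d[i] = odd[i] - (even[i] + even[i+1]) / 2, where reading even[i+1] for the last i would run off the end of the shared array. The names `predict_step`, `run_predict`, and `HALF` are made up for this example, and the host code assumes a current CUDA toolkit.

```cuda
#include <cuda_runtime.h>

#define HALF 128  // hypothetical: max number of even/odd pairs per block

// One predict step of a lifting scheme, with the right neighbor clamped
// so that even[i + 1] never indexes past the last loaded element.
__global__ void predict_step(const float *x, float *d, int pairs)
{
    __shared__ float even[HALF];
    __shared__ float odd[HALF];
    int i = threadIdx.x;

    if (i < pairs) {
        even[i] = x[2 * i];
        odd[i]  = x[2 * i + 1];
    }
    __syncthreads();   // outside the branch, reached by all threads

    if (i < pairs) {
        int right = (i + 1 < pairs) ? i + 1 : pairs - 1;   // clamp at the edge
        d[i] = odd[i] - 0.5f * (even[i] + even[right]);
    }
}

// Host helper: x holds 2 * pairs samples, d receives pairs coefficients.
// Requires 1 <= pairs <= HALF. Returns 0 on success.
int run_predict(const float *h_x, float *h_d, int pairs)
{
    if (pairs < 1 || pairs > HALF) return -1;
    size_t in_bytes = 2 * pairs * sizeof(float);
    size_t out_bytes = pairs * sizeof(float);
    float *d_x = NULL, *d_d = NULL;
    if (cudaMalloc(&d_x, in_bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_d, out_bytes) != cudaSuccess) { cudaFree(d_x); return -1; }

    cudaError_t err = cudaMemcpy(d_x, h_x, in_bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        predict_step<<<1, HALF>>>(d_x, d_d, pairs);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // surfaces kernel faults
    if (err == cudaSuccess)
        err = cudaMemcpy(h_d, d_d, out_bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_d);
    return err == cudaSuccess ? 0 : -1;
}
```

With x = {1, 2, 3, 4, 5, 6, 7, 8} and pairs = 4, the result is d = {0, 0, 0, 1}; the last value differs because the right neighbor is clamped. On current toolkits, running the program under `compute-sanitizer` (or the older `cuda-memcheck`) can also report out-of-bounds shared-memory accesses directly.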