How to make the threads idle? Histogramming application

Hi guys,

I will try to be as specific as possible. I need to obtain a color histogram of an image. To do so, I have a window of 15x15 (constant, cannot change it). In other words, every thread in my application moves a 15x15 = 225 pixel window of the original image into shared memory and does the color histogramming there. I also do some R,G,B to H,S,V color conversion in shared memory. Due to the size of the image, it seems I need 71x47 threads: 71 in the horizontal direction and 47 in the vertical. So I set the block size to 16x16. Using this block size, I have to have a grid dimension of 5x5 blocks, which results in 80x80 threads, and that is larger than 71x47. Finally, here is my question:

How can I make the excess threads idle? For example, would this be legitimate code if I put it at the beginning of my kernel:

if (blockIdx.x * blockDim.x + threadIdx.x >= 71 || blockIdx.y * blockDim.y + threadIdx.y >= 47)
    return;

This basically says: if the thread index in the horizontal or vertical direction is outside the valid range, just return and do not do anything.

The reason I am asking is that something in my code makes my GPU hang. I suspect I have a flaw in how I use the programming model, and this part of the code is my main suspect. Also, if you think there is a problem in my coding logic (block or grid sizes, inefficient computing), would you let me know?

Thanks in advance for the reply.

Yes, you can exit any thread at any time.
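
As a rough illustration of such an early exit, here is a minimal sketch, not your actual kernel. It assumes each thread derives a global index from blockIdx, blockDim and threadIdx and returns if that index falls outside the 71x47 range. The names markWindows, divUp, inRange and gridFor are made up for this example. If I read your numbers right, rounding up 71/16 and 47/16 gives a 5x3 grid (80x48 threads), so fewer blocks than 5x5 would already cover everything.

```cuda
#include <cuda_runtime.h>

// Round-up integer division: the smallest number of blocks of size d
// that covers n items. (Hypothetical helper, not a CUDA API.)
__host__ __device__ inline int divUp(int n, int d) {
    return (n + d - 1) / d;
}

// True if the global coordinate (x, y) lies inside a width x height range.
__host__ __device__ inline bool inRange(int x, int y, int width, int height) {
    return x < width && y < height;
}

// Illustrative kernel: every in-range thread writes a 1 to its own slot.
// Surplus threads return immediately, which is fine here because the
// kernel contains no __syncthreads() after the guard.
__global__ void markWindows(int *out, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (!inRange(x, y, width, height)) {
        return;
    }
    out[y * width + x] = 1;
}

// Grid dimensions needed to cover width x height with square blocks.
inline dim3 gridFor(int width, int height, int blockEdge) {
    return dim3(divUp(width, blockEdge), divUp(height, blockEdge));
}
```

A launch would then look like markWindows<<<gridFor(71, 47, 16), dim3(16, 16)>>>(d_out, 71, 47), where d_out is assumed to be a device buffer of 71 * 47 ints.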

There is one exception: the early-out must not skip any __syncthreads() call, as the other threads in the block can then wait forever for the one that has already exited. This also applies to any other flow control structure. So if you have a sync inside an if() statement, you need to be sure that either all threads of the block take the if-branch or all take the else-branch. The same goes for for-loops: all threads need to do the same number of iterations if there is a sync inside.
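
One common way to respect this rule is to keep the surplus threads alive up to the barrier and only mask out their work. The following is a minimal sketch under my own assumptions, not your kernel: the block is assumed to be TILE x TILE threads, and the names stageAndProcess, isActive and TILE are invented for the example. The processing step is a placeholder (average of a pixel and its left neighbour in the tile) standing in for the real histogram work.

```cuda
#include <cuda_runtime.h>

#define TILE 16  // assumed block edge; the kernel expects TILE x TILE threads

// True if the global coordinate (x, y) maps to real data.
__host__ __device__ inline bool isActive(int x, int y, int width, int height) {
    return x < width && y < height;
}

// Illustrative kernel: stage one value per thread in shared memory,
// synchronize, then let only the in-range threads produce output.
__global__ void stageAndProcess(const float *in, float *out,
                                int width, int height) {
    __shared__ float tile[TILE][TILE];

    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    bool active = isActive(x, y, width, height);

    // Surplus threads do NOT return here; they store a dummy value so that
    // they still reach the barrier below.
    tile[threadIdx.y][threadIdx.x] = active ? in[y * width + x] : 0.0f;

    // Reached by every thread of the block, outside any divergent branch.
    __syncthreads();

    // Divergent work is fine as long as no barrier sits inside the branch.
    if (active) {
        int left = threadIdx.x > 0 ? threadIdx.x - 1 : 0;
        out[y * width + x] =
            0.5f * (tile[threadIdx.y][threadIdx.x] + tile[threadIdx.y][left]);
    }
}
```

The design choice here is to replace the early return with an active flag: all threads execute the same sequence of barriers, and only the loads and stores are guarded.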