Loop inside a kernel: how many iterations?

Just a simple question.

Is there a limit on the number of iterations a loop can run for one thread in a kernel?


I don’t know whether there is a limit on iterations, but I’m having a problem with how long a kernel can run. When my kernel takes more than 5-6 seconds to finish, it just ends without giving me any error report, but I know something went wrong because it doesn’t do what it’s supposed to do. My workaround was to implement a kernel with a small loop (around 10000 iterations) that takes less than 1 second to finish, and to run this kernel as many times as I need to complete the “real” loop. The problem is that I have to call the same kernel many times, which is inefficient.

I’d like to know if there is a way to avoid this restriction since it really slows down the algorithm…


You have almost certainly stumbled into the watchdog timer. If you are running CUDA on a device which is also managing the GUI display, the video driver will terminate kernels which take longer than a few seconds. The only way to avoid this is to run your kernel on a device not managing the display, or if you are running on Linux, you can turn off X.
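If you want to confirm this rather than guess, here is a rough sketch of how I would check it. It asks the runtime whether the device has a kernel run-time limit, and it checks for errors after a launch so the failure is no longer silent. The helper names `has_kernel_timeout` and `report_kernel_error` are my own, not part of CUDA; the runtime calls themselves are standard.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if kernels on this device are subject to a
// run-time limit (the display watchdog). Returns false if the query fails.
bool has_kernel_timeout(int device) {
    int enabled = 0;
    cudaError_t err =
        cudaDeviceGetAttribute(&enabled, cudaDevAttrKernelExecTimeout, device);
    return err == cudaSuccess && enabled != 0;
}

// Hypothetical helper: call right after a kernel launch. It reports both
// launch-time errors and errors raised while the kernel was executing.
// A kernel killed by the watchdog typically surfaces as cudaErrorLaunchTimeout.
bool report_kernel_error(const char* label) {
    cudaError_t err = cudaGetLastError();   // bad launch configuration, etc.
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // wait, then pick up execution errors
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", label, cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

Calling `report_kernel_error("my kernel")` after your launch should print a message when the kernel is terminated, instead of the kernel just ending quietly.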

Having to break your kernel up into several calls should not be terribly inefficient (though annoying to program, I agree), since the overhead of launching a kernel is usually tens of microseconds.
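For what it's worth, a minimal sketch of the chunked approach could look like the following. Everything named here (`accumulate_chunk`, `run_chunked`, the `state` array, the `acc += 1.0f` loop body) is a placeholder I made up, since I don't know what your real loop does. The idea is only that each thread saves its running state to global memory at the end of a launch and picks it up again in the next one.

```cuda
#include <algorithm>
#include <cuda_runtime.h>

// Placeholder kernel: each thread runs iterations [start, start + count)
// of the long loop, resuming from the value stored by the previous launch.
__global__ void accumulate_chunk(float* state, int n,
                                 long long start, long long count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float acc = state[i];                    // resume saved state
    for (long long k = start; k < start + count; ++k) {
        acc += 1.0f;                         // stand-in for the real loop body
    }
    state[i] = acc;                          // save state for the next launch
}

// Runs total_iters iterations per thread, at most chunk_iters per launch,
// so that each individual launch stays short.
cudaError_t run_chunked(float* d_state, int n,
                        long long total_iters, long long chunk_iters) {
    if (n <= 0 || chunk_iters <= 0) return cudaErrorInvalidValue;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    for (long long start = 0; start < total_iters; start += chunk_iters) {
        long long count = std::min(chunk_iters, total_iters - start);
        accumulate_chunk<<<blocks, threads>>>(d_state, n, start, count);
        cudaError_t err = cudaGetLastError();  // catch launch failures early
        if (err != cudaSuccess) return err;
    }
    return cudaDeviceSynchronize();            // wait for all chunks to finish
}
```

You would call it as `run_chunked(d_state, n, total_iters, 10000)` with `d_state` allocated by `cudaMalloc`. The launches are queued back to back and the host only waits once at the end, which keeps the per-call cost small. If you need to know which chunk failed, you could synchronize inside the loop instead, at some cost in speed.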

Ok, thanks mate, I’ll give it a try…