Filter Problem (the launch timed out and was terminated)

HiRez · December 15, 2009, 3:57pm

Greetings,

I have just started working on a quadratic filter to be applied to a 16 bits/channel image.
Currently, all I am interested in is to apply the filter definition without resorting to svd or fft.

I successively launch my kernel 3 times; one for each channel.

I am not using shared memory, just global and locals.

When I set the filter dimension (the size of the square around the pixel in question, whose values are used in the computation) beyond a certain value, first launch (for red channel) does not give any errors but does not return anything either. The other two launches (green and blue) give the “the launch timed out and was terminated” error.

Is there a built-in watchdog timer mechanism in cuda that is preventing proper execution and completion of my first launch ?

Suggestions are welcome and appreciated.

Regards

eyalhir74 · December 15, 2009, 4:02pm

Its actually a watchdog defined by windows which should be 5 seconds.

Your kernel probably has an infinite loop in it (or its very very complex and in that case you should break it into

a few kernels) - probably has an infinite loop though :)

As for the watch-dog just search the forum you’ll see a lot of references about this.

eyal

HiRez · December 15, 2009, 4:28pm

It cannot be an infinite loop:
a- The same code with the same data set runs properly in emu-debug, very slowly of course.
b- The same code with a smaller data set runs properly on the GPU.
It must be hitting the 5 second limit, then.

Regards

eyalhir74 · December 15, 2009, 4:42pm

Thats not correct - imagine if you have a loop in the kernel and the loop iteration count is calculated in your

kernel. In emu-debug, the calculation might be correct but when you run in release the value turns out

to be garbage or very large.

In general the fact that a code runs perfectly with emulation doesnt mean it runs properly in release on the GPU.

Smaller data sets might work fine because you’re not going out of bounds reading/writing garbage values

affecting the kernel time.

Maybe you can post the kernel code, or try to pin-point the problem yourself by commenting portions of your

code till you find the offending code.

hope that helps.

eyal

HiRez · December 15, 2009, 5:46pm

Ran the kernel as <<<1,1>>> with an, otherwise offending, filter dimension (i.e. applied the filter to pixel 0 only).
The loops in the code are controlled by a function argument, the filter dimension (or tap), not derived from blockIdx or threadIdx values.
cutxxxx() reported execution time was less than 0.065 ms.
I am now suspecting a severe case of serialization.

Topic		Replies	Views
Fatal error:the launch timed out and was terminated CUDA Programming and Performance	5	9865	April 19, 2016
Error on iteration of cuda kernel CUDA Programming and Performance	4	4403	July 11, 2011
Error in lunching a kernel "the launch timed out and was terminated" CUDA Programming and Performance	1	935	April 13, 2011
Kernel problem, execution stop after ~15min CUDA Programming and Performance	7	1895	November 4, 2016
the launch timed out and was to terminated? CUDA Programming and Performance	3	1206	April 3, 2009
Inconsistent results watchdog issues? CUDA Programming and Performance	6	2845	March 19, 2008
cudaErrorLaunchFailure without any apparent occurrence pattern? CUDA Programming and Performance	6	1393	November 10, 2016
question about "launch timed out" CUDA Programming and Performance	2	1447	April 24, 2009
CUDA limit for loops..? too large number of iterations? CUDA Programming and Performance	28	27669	March 20, 2008
Another launch timeout... CUDA Programming and Performance	2	1953	October 16, 2008

Filter Problem (the launch timed out and was terminated)

Related topics