Another crash

Hey there,
I’m wondering, is there something essentially wrong with the following code:

result = 0;
for(int i = 0; i < 10000; i++)
    for(int j = 0; j < 10000; j++)
        result = i*0.01 + j*0.01;

Console error output: Unspecified launch failure

The code executes fine on the GPU if I put 1000 instead of 10000 in the loop conditions, but it crashes with the given values. This is basically my very first CUDA program, and I was mainly just testing random constructs, so never mind the usefulness of the code.
If the problem is not an obvious one, it would be kind of you to point me to the section of the programming guide that explains it.
Thanks in advance.

If result is a float variable and is written back to device memory (so the compiler does not optimize the computation away), there is nothing wrong with the code. You should, however, convert i and j to float explicitly to make the intent clearer to the compiler.
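A minimal sketch of that suggestion, assuming a single-thread launch and a recent CUDA toolkit. The names `sumKernel` and `runSum` are made up for illustration, and the loop bound is a parameter so a small value can be tried first:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: the nested loop from the question, with explicit
// float conversions and float literals. The result is stored to device
// memory so the computation is not removed as dead code (the compiler may
// still simplify the loops, since only the last iteration's value survives).
__global__ void sumKernel(float *out, int n)
{
    float result = 0.0f;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            result = (float)i * 0.01f + (float)j * 0.01f;
    *out = result;
}

// Hypothetical host helper: launches the kernel and copies the result back.
cudaError_t runSum(int n, float *h_out)
{
    float *d_out = NULL;
    cudaError_t err = cudaMalloc((void **)&d_out, sizeof(float));
    if (err != cudaSuccess)
        return err;

    sumKernel<<<1, 1>>>(d_out, n);

    // cudaMemcpy waits for the kernel to finish, so an error raised while
    // the kernel was running is reported by this call.
    err = cudaMemcpy(h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err;
}
```

Calling `runSum(1000, &value)` from `main` and printing `cudaGetErrorString` on the returned code should make it easy to see at which loop bound the launch starts to fail.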

How long does the kernel run? Are you aware of the 5-second limit that applies when the G80 is also used for the desktop?


Umm, no idea why this shouldn’t work. Maybe you could have a look at the generated PTX code?


Actually, it does run for around 5 seconds before crashing. And there is just this single G80 installed in the computer, so it's definitely also used for the desktop display. Could you explain this a little further or give a link?

That's the problem. To avoid the 5-second limitation, you need two (NVIDIA) cards: one for the display and one for CUDA. The watchdog times out when a CUDA program runs on the display card for more than 5 seconds without returning.
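With only one card, a workaround that is sometimes used is to split the work into several shorter kernel launches so that each one returns well inside the limit. The following is a rough sketch of that idea, not something taken from the FAQ. The names `partialKernel`, `watchdogActive` and `runInChunks` are hypothetical, and a recent CUDA toolkit is assumed. The `kernelExecTimeoutEnabled` field of `cudaDeviceProp` reports whether a run time limit applies to the device:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: processes only the outer iterations [iBegin, iEnd).
__global__ void partialKernel(float *out, int iBegin, int iEnd, int n)
{
    float result = *out;
    for (int i = iBegin; i < iEnd; i++)
        for (int j = 0; j < n; j++)
            result = (float)i * 0.01f + (float)j * 0.01f;
    *out = result;
}

// Returns true if the device reports a kernel run time limit.
bool watchdogActive(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;
    return prop.kernelExecTimeoutEnabled != 0;
}

// Runs the outer loop in slices of `chunk` iterations, one launch per slice.
cudaError_t runInChunks(int n, int chunk, float *h_out)
{
    if (chunk <= 0)
        return cudaErrorInvalidValue;

    float *d_out = NULL;
    cudaError_t err = cudaMalloc((void **)&d_out, sizeof(float));
    if (err != cudaSuccess)
        return err;
    err = cudaMemset(d_out, 0, sizeof(float));

    for (int iBegin = 0; err == cudaSuccess && iBegin < n; iBegin += chunk) {
        int iEnd = (iBegin + chunk < n) ? iBegin + chunk : n;
        partialKernel<<<1, 1>>>(d_out, iBegin, iEnd, n);
        // Wait for this slice so that each launch stays short and any
        // failure is reported before the next launch.
        err = cudaDeviceSynchronize();
    }

    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err;
}
```

The chunk size has to be found by experiment: it should be small enough that a single launch finishes in far less than 5 seconds on the card in question.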

See the FAQ about the “watchdog timer”.