Unspecified launch failure only when breaking out of a for loop

whitewatercn · July 14, 2009, 7:27am

My kernel has a for loop:

[codebox]
for(int i=0;i<rows;i++)
{
    x=…
    …
    if(x==0) break;
}
[/codebox]

Although CUDA discourages divergent code, the break statement gives a 2x speedup in my case.

The program worked fine on many data sets for many days. Recently, however, it reported an “unspecified launch failure” error for some data set.

The strangest thing is that if I comment out the “if(x==0) break” line, the program runs correctly.

So what could the reason be?

Thanks

sigismondo · July 14, 2009, 7:35am

Just a guess… could x evaluate to NaN for some specific data set and cause this behaviour? I have never really investigated how CUDA behaves with NaNs.

whitewatercn · July 14, 2009, 7:54am

Thanks for the reply.

x is read from an array in global memory whose elements are all positive integers or 0, so NaN is impossible.

jgoffeney · July 14, 2009, 1:48pm

Rather than using break, try adding the condition to your for loop, like this:

[codebox]for(int i = 0; i < rows && x != 0; i++)[/codebox]

I had the same problem and this seemed to fix it.
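
One caveat with this rewrite, as a minimal sketch: moving the test into the loop condition is not quite the same as the break. x has to start out nonzero or the loop never runs, and whatever follows the assignment to x still executes once for the zero value. The loop body here (reading x from an array and summing it) is only a hypothetical stand-in for the elided kernel code, and both function names are made up for illustration. It is written as plain host C++ so it can be run on the CPU; the loops would have the same shape inside a kernel.

```cpp
// Version with break: leaves the loop in the middle of the body,
// so nothing after the test runs for the zero element.
int sum_until_zero_break(const int* data, int rows) {
    int sum = 0;
    for (int i = 0; i < rows; i++) {
        int x = data[i];        // hypothetical stand-in for "x=…"
        if (x == 0) break;      // early exit, rest of the body is skipped
        sum += x;
    }
    return sum;
}

// Version with the test in the loop condition.
int sum_until_zero_condition(const int* data, int rows) {
    int sum = 0;
    int x = 1;                  // must start nonzero, or the loop never runs
    for (int i = 0; i < rows && x != 0; i++) {
        x = data[i];
        sum += x;               // still runs once when x == 0 (adds 0 here)
    }
    return sum;
}
```

For this particular body the two versions return the same sum, because adding the zero changes nothing. If the code after the test had side effects on a zero value, the two forms would differ.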

whitewatercn · July 15, 2009, 1:04am

Thanks for the suggestion. I tried it, but unfortunately the problem is still there.

Nico · July 15, 2009, 7:28am

Are you using a synchronization barrier in the code? If so, having just one thread exit early can keep the sync from completing.

N.

BeachHut · July 15, 2009, 8:39am

Try using CUDA-gdb or running in device emulation mode to identify where the problem arises and to check the values of x.
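
If a debugger is not convenient, one rough way to check the values is to scan the input on the host before the launch. This is only a sketch in plain host C++: the function names are hypothetical, and it assumes x comes straight from an `int` array, which may not match the real kernel.

```cpp
// Hypothetical host-side check, run before launching the kernel.
// Returns the index of the first negative element, or -1 if every
// element is a non-negative integer as expected.
int first_negative_index(const int* data, int n) {
    for (int i = 0; i < n; i++) {
        if (data[i] < 0) return i;
    }
    return -1;
}

// Returns the index of the first zero, i.e. where a loop that breaks
// on x == 0 would stop, or n if no element is zero.
int first_zero_index(const int* data, int n) {
    int i = 0;
    while (i < n && data[i] != 0) i++;
    return i;
}
```

Comparing the first-zero index between a data set that works and one that fails may show whether the failure depends on where the loop exits.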

whitewatercn · July 20, 2009, 1:54am

No, I don’t use __syncthreads() in the loop.

whitewatercn · July 20, 2009, 2:04am

The key point is this: if I comment out the if…break line, the program runs correctly and gives the right results.

Also, x is used in other lines of code without any problem.

So the data should be OK.

I think the problem comes from the program logic or from the compiler.