Launch failure! Please help! Bug in the CUDA code, where multiple kernels are launched sequentially

The CUDA code that I have written launches up to 8 kernels sequentially, one after the other. But I get a crash at the launch of the fourth kernel, irrespective of which kernel it is. Even when there are no instructions at all in the kernel, it still crashes. What might have gone wrong?

Awaiting a quick reply

Because of asynchronous execution, the problem might be in the third kernel. Is the problem also independent of what the third kernel does?
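One way to narrow this down is to check for errors after every launch and synchronize before moving on, so that a failure gets reported against the kernel that actually caused it. The following is only a minimal sketch: emptyKernel and checkKernel are hypothetical names standing in for your own kernels and helper, and the <<<1, 1>>> launch configuration is just a placeholder.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel standing in for the poster's kernels.
__global__ void emptyKernel() {}

// Hypothetical helper: returns true if the preceding kernel both launched
// and finished executing without error.
static bool checkKernel(const char *label)
{
    // Launch-time errors (for example an invalid configuration) show up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: launch failed: %s\n", label, cudaGetErrorString(err));
        return false;
    }
    // Block until the kernel has finished, so that an execution error is
    // attributed to this kernel and not to a later launch.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: execution failed: %s\n", label, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    const int numKernels = 8;  // the original post mentions up to 8 kernels
    for (int i = 0; i < numKernels; ++i) {
        char label[32];
        snprintf(label, sizeof(label), "kernel %d", i + 1);
        emptyKernel<<<1, 1>>>();
        if (!checkKernel(label)) {
            return 1;
        }
    }
    printf("all %d kernels completed\n", numKernels);
    return 0;
}
```

The synchronization after each launch costs performance, so this is meant for debugging only. If the error still shows up at the fourth kernel with this in place, the third kernel is probably not to blame.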

What happens if you run the program with cuda-memcheck?

Are you on Windows? If yes, are you using the TCC driver? How long do the kernels run? The watchdog timer might be triggering after a few seconds; its purpose is to stop runaway kernels. Under Windows without the TCC driver, kernels get batched and the timeout applies to the whole batch, not just to a single kernel. You can place cudaStreamQuery(0) between kernel launches to invoke them separately.
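As a rough sketch of the cudaStreamQuery(0) suggestion, assuming all kernels go to the default stream: emptyKernel and launchSeparately are hypothetical names, and the launch configuration is a placeholder for whatever your code really uses.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel standing in for the real ones.
__global__ void emptyKernel() {}

// Hypothetical helper: launches n kernels on the default stream and queries
// the stream after each launch so the driver submits the work right away
// instead of holding it back for batching.
static cudaError_t launchSeparately(int n)
{
    for (int i = 0; i < n; ++i) {
        emptyKernel<<<1, 1>>>();

        // cudaStreamQuery does not block. cudaErrorNotReady only means the
        // kernel is still running, so it is not treated as a failure here.
        cudaError_t q = cudaStreamQuery(0);
        if (q != cudaSuccess && q != cudaErrorNotReady) {
            fprintf(stderr, "after kernel %d: %s\n", i + 1, cudaGetErrorString(q));
            return q;
        }
    }
    // Wait for everything at the end and report any execution error.
    return cudaDeviceSynchronize();
}

int main()
{
    cudaError_t err = launchSeparately(8);
    if (err != cudaSuccess) {
        fprintf(stderr, "failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("done\n");
    return 0;
}
```

Unlike cudaDeviceSynchronize(), the query returns immediately, so the kernels can still overlap with host code. If a single kernel runs longer than the watchdog limit by itself, this will not help.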