Weird behavior in kernel calls, related to asynchronous vs. synchronous execution

Hello, I’ve got a really weird problem with a kernel call.

My main kernel, “CompKernel”, fails with an “unspecified launch failure” error only if a preceding kernel, “DebugKernel”, is NOT called.
The weird thing is that DebugKernel really does nothing (its body is just if(tid < n){ no instructions }), yet calling it somehow influences the computations of the next kernel: CompKernel returns correct results only if DebugKernel is called first.

I know kernel calls are asynchronous, so after every kernel I call cudaThreadSynchronize() followed by this error check, both to debug and to force synchronous execution, but the problem remains:
checkCUDAError(“name of the kernel”);
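
For reference, the pattern after each launch looks roughly like the sketch below. It is not my exact code: the body of checkCUDAError, the body of CompKernel, the launch configuration, and the wrapper runKernels are all assumptions made up for illustration. It also uses cudaDeviceSynchronize(), the current name for the deprecated cudaThreadSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed helper: reports the last CUDA runtime error, if any.
// Returns true when no error is pending.
static bool checkCUDAError(const char *name)
{
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error after %s: %s\n", name, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Does nothing, as described in the post.
__global__ void DebugKernel(int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        // no instructions
    }
}

// Placeholder for the real computation (assumption: doubles each element).
__global__ void CompKernel(float *data, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        data[tid] = 2.0f * data[tid];
    }
}

// Hypothetical wrapper: launches the kernels with a synchronize + error check
// after each one. d_data must point to n floats in device memory, n > 0.
bool runKernels(float *d_data, int n, bool callDebugKernel)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    if (callDebugKernel) {
        DebugKernel<<<blocks, threads>>>(n);
        cudaDeviceSynchronize();   // block the host until the kernel finishes
        if (!checkCUDAError("DebugKernel")) return false;
    }

    CompKernel<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();       // block the host until the kernel finishes
    return checkCUDAError("CompKernel");
}
```

With this layout, the host does not continue past a launch until that kernel has completed, and any launch or execution error is reported under the name of the kernel that was just run.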

Now I don’t really know whether the “cudaThreadSynchronize()” call is enough to prevent asynchronous kernel execution. Is there any other call that would prevent this behavior?

The GPU is a GeForce GTX 295 with compute capability 1.3.

I’d appreciate any comments. Thanks in advance.