I am new to CUDA. I have a problem with my code, and I suspect it is related to the asynchronous execution of two new kernels I added.
In the CUDA Design Guide (rf.220.127.116.11, v. July 2013) I read:
“Kernel launches are synchronous in the following cases:
(a) The application is run via a debugger or memory checker (cuda-gdb, cuda-memcheck, Nsight) on a device of compute capability 1.x;
(b) Hardware counters are collected via a profiler (Nsight, Visual Profiler).”
Is (a) true only for devices of compute capability 1.x?
I am working with a device of compute capability 2.x. If I debug with cuda-memcheck, are the kernels executed synchronously or asynchronously?
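To take the tool out of the equation, I suppose I could force synchronous behavior myself by calling cudaDeviceSynchronize() after each launch. Below is a minimal sketch of what I mean. The names first_kernel, second_kernel, sync_and_check and run_both_kernels are placeholders for illustration, not my real code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-ins for the two new kernels.
__global__ void first_kernel(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = i;
}

__global__ void second_kernel(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Hypothetical helper: reports a launch error, otherwise blocks the host
// until all previously launched device work has finished.
static cudaError_t sync_and_check(const char *label) {
    cudaError_t err = cudaGetLastError();   // errors detected at launch time
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // errors raised while the kernel ran
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", label, cudaGetErrorString(err));
    }
    return err;
}

// Runs both kernels one after the other, synchronizing after each launch,
// then copies the result back into host_out (which must hold n ints).
cudaError_t run_both_kernels(int *host_out, int n) {
    int *dev = NULL;
    cudaError_t err = cudaMalloc(&dev, n * sizeof(int));
    if (err != cudaSuccess) return err;

    int threads = 128;
    int blocks = (n + threads - 1) / threads;

    first_kernel<<<blocks, threads>>>(dev, n);
    err = sync_and_check("first_kernel");

    if (err == cudaSuccess) {
        second_kernel<<<blocks, threads>>>(dev, n);
        err = sync_and_check("second_kernel");
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(host_out, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err;
}
```

As far as I understand, setting the environment variable CUDA_LAUNCH_BLOCKING=1 before running the application should also disable asynchronous kernel launches without any code changes. If the problem disappears with either approach, that would point to a synchronization issue in my code.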
Many thanks for any help!