What is measured in the CPU TIME column of the Visual Profiler output? Is it the time required to launch the kernel? The PCIe delay?
I’m coding an iterative process in which each iteration is mapped to a kernel launch. I’m worried because the GPU TIME and CPU TIME values are almost identical. Does this mean that the kernel is doing too little computation?