Concurrent kernel execution

Is there any way to get information about concurrent kernel execution on a device? The problem with profiling is that it serializes the kernels and then reports statistics for each one. In reality, though, two kernels can start executing concurrently and the device can switch between them.

For example, if we look at the stream or context numbers that the profiler reports for each kernel, is there any hope of inferring concurrency from them?

Hi, @mahmood.nt

Please check whether Nsight Systems (NVIDIA Developer) meets your needs.

Nsight Systems does not serialize kernels.
When kernels run concurrently, you can see them overlap on the Nsight Systems timeline.
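If you would rather check for overlap numerically than by eye, here is a minimal sketch of the idea. It assumes you already have per-kernel start and end timestamps from a trace that does not serialize kernels; how you export them is not shown. The record layout, field names, kernel names, and sample values are all made up for illustration and are not an Nsight Systems format. Note that stream numbers alone do not prove concurrency: kernels in different streams may overlap, but only the timestamps show whether they actually did.

```python
from typing import NamedTuple


class KernelRecord(NamedTuple):
    # Hypothetical record layout, not a real profiler export format.
    name: str
    stream: int
    start_ns: int
    end_ns: int


def overlapping_pairs(records):
    """Return (name_a, name_b, overlap_ns) for every pair of kernels
    whose [start_ns, end_ns) intervals intersect."""
    ordered = sorted(records, key=lambda r: r.start_ns)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start_ns >= a.end_ns:
                # Records are sorted by start time, so no later
                # kernel can overlap `a` either.
                break
            overlap_ns = min(a.end_ns, b.end_ns) - b.start_ns
            pairs.append((a.name, b.name, overlap_ns))
    return pairs


# Made-up sample data: kernelB starts while kernelA is still running,
# and kernelC starts while kernelB is still running.
trace = [
    KernelRecord("kernelA", 13, 0, 1000),
    KernelRecord("kernelB", 14, 400, 1500),
    KernelRecord("kernelC", 13, 1000, 1800),
]

result = overlapping_pairs(trace)
```

An empty result would mean no two kernels in the trace ran at the same time. With serialized profiler output that is what you should expect regardless of how the application behaves, which is why the timestamps need to come from a non-serializing trace.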

So, which section do you mean exactly? As you can see below, the timeline doesn’t show the kernels. I zoomed into the timeline but didn’t see any kernel names.