When using Nsight Compute, are two or more kernels profiled separately or concurrently?

Let's say a process has two kernels and launches them almost concurrently. When Nsight Compute (ncu) profiles this process, does it execute the two kernels separately and collect the relevant metrics (e.g. dram__bytes) for each one, or does it run them concurrently?

By default, profiled workloads are isolated and serialized for reproducibility, so each kernel is profiled on its own. To profile kernels that may execute concurrently, you can use the range-based data collection modes, which collect data over a range of work instead of a single kernel.
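As a rough illustration of the range-based approach, here is a minimal sketch that launches two kernels on separate streams and wraps them in a `cudaProfilerStart()` / `cudaProfilerStop()` pair, which is one way to mark a range. The kernel names `scale` and `offset`, the buffer sizes, and the metric name `dram__bytes.sum` in the comment are assumptions made for the example and are not from the original question. The range modes restrict which API calls may appear inside a range, so check the Nsight Compute documentation before relying on this layout.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cuda_profiler_api.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);   \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Hypothetical kernels standing in for the two kernels in the question.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

__global__ void offset(float *data, int n, float delta) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += delta;
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Allocate and initialize outside the profiled range.
    float *a = nullptr, *b = nullptr;
    CHECK(cudaMalloc(&a, n * sizeof(float)));
    CHECK(cudaMalloc(&b, n * sizeof(float)));
    CHECK(cudaMemset(a, 0, n * sizeof(float)));
    CHECK(cudaMemset(b, 0, n * sizeof(float)));

    // Separate streams allow the two kernels to overlap on the GPU.
    cudaStream_t s1, s2;
    CHECK(cudaStreamCreate(&s1));
    CHECK(cudaStreamCreate(&s2));
    CHECK(cudaDeviceSynchronize());

    // Everything between Start and Stop forms one range.
    CHECK(cudaProfilerStart());
    scale<<<blocks, threads, 0, s1>>>(a, n, 2.0f);
    offset<<<blocks, threads, 0, s2>>>(b, n, 1.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // finish all work before the range ends
    CHECK(cudaProfilerStop());

    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaStreamDestroy(s2));
    CHECK(cudaFree(a));
    CHECK(cudaFree(b));
    return 0;
}

// Possible invocations (metric name is an assumption):
//   ncu --replay-mode range     --metrics dram__bytes.sum ./app
//   ncu --replay-mode app-range --metrics dram__bytes.sum ./app
```

With a range mode, the metric values should be reported for the range as a whole and not per kernel, so this answers "how much DRAM traffic did the two kernels generate together while overlapping". The default kernel replay mode answers the per-kernel question instead.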
