Can I see each thread's status?

Hi, can I use Nsight Compute or another profiling tool to get a clear view of what each thread is actually doing, for example whether it is stalled waiting on something? Or can I only analyse my kernel using aggregate data such as the throughputs and rates that Nsight Compute reports?
Many thanks!

Hi, @markusxwr

Thanks for using Nsight Compute.
Would you please check whether the functionality provided by the “System Trace” activity meets your needs?

@veraj's response was about OS threads.

Nsight Compute does not support per-thread stall reasons or status. It does support statistical sampling of warp state and program counter, which is displayed in the Source View “Sampling” columns. The warp stall reason applies to the set of active threads in the warp that are currently being scheduled (in the case of divergence). The sampler does not provide the warp active mask for each sample.

Is there a specific performance use case that you are trying to debug that is not supported by the current statistical sampling? For example, are you trying to debug performance issues caused by high thread execution divergence?

Hi Greg,
Understood. I am not debugging performance issues caused by high thread execution divergence. I have designed the kernel so that all threads do the same work.
So I can only analyse my code with this statistical sampling and use it to find the cause of the latency, right?

Correct. The best approach in the current tool is to rely on the statistical sampling in the Source View.

It is also possible to add timing inside the kernel by reading the PTX special register %clock through inline PTX. If you read it into a few variables and write them out at the end of the kernel, you can get a better understanding of the overhead. When using this approach, always review the generated SASS disassembly to make sure the compiler did not move the clock reads.
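A minimal sketch of the %clock approach, in case it helps. The kernel `timed_scale`, the helper `read_clock`, the host function `run_timed_scale`, and the arithmetic loop are all hypothetical placeholders for illustration, not anything from Nsight Compute. It assumes a single CUDA-capable device and keeps error handling to a minimum.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Read the 32-bit per-SM cycle counter through inline PTX.
// "volatile" and the "memory" clobber discourage the compiler from
// moving the read across memory operations, but they are not a guarantee.
__device__ __forceinline__ unsigned int read_clock()
{
    unsigned int c;
    asm volatile("mov.u32 %0, %%clock;" : "=r"(c) :: "memory");
    return c;
}

// Hypothetical kernel: does some placeholder work per element and records
// the clock before and after it for every thread.
__global__ void timed_scale(const float* in, float* out, int n,
                            unsigned int* t_start, unsigned int* t_end)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    unsigned int t0 = read_clock();
    float v = in[i];
    for (int k = 0; k < 64; ++k) v = v * 1.0001f + 0.5f;  // placeholder work
    out[i] = v;
    unsigned int t1 = read_clock();

    // Write the timestamps out at the end of the kernel.
    t_start[i] = t0;
    t_end[i] = t1;
}

// Launches the kernel and reports the smallest and largest per-thread
// cycle count. Returns false if any CUDA call fails.
bool run_timed_scale(int n, unsigned int* min_cycles, unsigned int* max_cycles)
{
    float *d_in = nullptr, *d_out = nullptr;
    unsigned int *d_t0 = nullptr, *d_t1 = nullptr;
    std::vector<float> h_in(n, 1.0f);
    std::vector<unsigned int> h_t0(n), h_t1(n);

    bool ok = cudaMalloc(&d_in, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_out, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_t0, n * sizeof(unsigned int)) == cudaSuccess
           && cudaMalloc(&d_t1, n * sizeof(unsigned int)) == cudaSuccess
           && cudaMemcpy(d_in, h_in.data(), n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int block = 128;
        timed_scale<<<(n + block - 1) / block, block>>>(d_in, d_out, n, d_t0, d_t1);
        ok = cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(h_t0.data(), d_t0, n * sizeof(unsigned int),
                        cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(h_t1.data(), d_t1, n * sizeof(unsigned int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    if (ok) {
        *min_cycles = 0xFFFFFFFFu;
        *max_cycles = 0u;
        for (int i = 0; i < n; ++i) {
            // Unsigned subtraction stays correct across one counter wrap.
            unsigned int dt = h_t1[i] - h_t0[i];
            if (dt < *min_cycles) *min_cycles = dt;
            if (dt > *max_cycles) *max_cycles = dt;
        }
    }
    cudaFree(d_in); cudaFree(d_out); cudaFree(d_t0); cudaFree(d_t1);
    return ok;
}

int main()
{
    unsigned int lo = 0, hi = 0;
    if (!run_timed_scale(256, &lo, &hi)) {
        std::printf("CUDA call failed\n");
        return 1;
    }
    std::printf("cycles per thread: min %u, max %u\n", lo, hi);
    return 0;
}
```

Note that %clock is a 32-bit counter that wraps and is maintained per SM, so only differences taken within the same thread are meaningful here. As mentioned above, dump the SASS (for example with `cuobjdump -sass`) and confirm that both clock reads still bracket the code you care about.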

Understood. Thanks a lot!
