Hi, I am trying to get a timeline of Tensor Core and CUDA core activity on the SM, like the one below, from Nsight Compute. How can I do that?
I ran the ncu command and got an ncu-rep file, but I can't find a timeline in Nsight Compute.
Can anyone help?
Hi, @hyaloids
I’m sorry, but Nsight Compute does not provide such a timeline.
Well, that’s OK. Is there any way to get Tensor Core and CUDA core runtime/memory utilization from Nsight Compute?
Thanks for your reply.
Hi, @hyaloids
Sorry for the late response.
The PmSampling section gives you a timeline view of the utilization of the SM, and of the tensor pipe in particular, on GA100 and newer. It can be collected with --section PmSampling, or as part of the full set.
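As a rough sketch of what that command line could look like, assuming a hypothetical application binary ./my_app and report name pm_report (both are placeholders, not taken from this thread):

```shell
# Placeholders (assumptions): replace with your own binary and report name.
APP=./my_app
REPORT=pm_report

# Collect only the PmSampling section; "--set full" would include it as well.
NCU_CMD="ncu --section PmSampling -o $REPORT $APP"

# Run only if ncu and the binary are present; otherwise just print the command.
if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
  $NCU_CMD
else
  echo "$NCU_CMD"
fi
```

The resulting report file can then be opened in the Nsight Compute UI, where the sampled timeline should appear with the PmSampling section on GA100 and newer.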
SM and Tensor Core utilization is also part of the default SpeedOfLight section in the basic set, with details on pipelines available in the GPU Throughput Breakdown tables. For more details, other sets contain the ComputeWorkloadAnalysis section, which reports the utilization of the individual compute pipelines.
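For example, both sections could be requested explicitly in one run. This is only a sketch: ./my_app and sol_report are placeholder names, and which sections a given set contains may differ between Nsight Compute versions.

```shell
# Placeholders (assumptions): replace with your own binary and report name.
APP=./my_app
REPORT=sol_report

# Request the SpeedOfLight and ComputeWorkloadAnalysis sections by name.
NCU_CMD="ncu --section SpeedOfLight --section ComputeWorkloadAnalysis -o $REPORT $APP"

# Run only if ncu and the binary are present; otherwise just print the command.
if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
  $NCU_CMD
else
  echo "$NCU_CMD"
fi
```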
Using the definition:
The closest match in hardware for utilization is counting the number of cycles the FMA pipes (CUDA cores) and the Tensor Pipe (Tensor Cores) are active.
The FMA Pipe (CC 7.0 - 8.0) or FMA Heavy Pipe (CC >8.0) can execute non-FP32 instructions including FP16 and IMAD.
Note that these “cores” are not equivalent to CPU cores. Since they are instruction pipelines, the request for memory utilization for these two “core” types is unclear.
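To see which pipe-related counters exist on a particular GPU, one option is to list the available metrics and filter them. The sketch below assumes that the relevant metric names contain "pipe_fma" or "pipe_tensor"; exact names vary by architecture, so treat the pattern as a starting point only.

```shell
# List the metrics ncu knows about for the GPUs in this system.
QUERY_CMD="ncu --query-metrics"

# Assumed naming pattern for FMA and tensor pipe metrics; adjust as needed.
FILTER='pipe_(fma|tensor)'

# Run only if ncu is installed; otherwise just print the command.
if command -v ncu >/dev/null 2>&1; then
  $QUERY_CMD | grep -E "$FILTER"
else
  echo "$QUERY_CMD | grep -E '$FILTER'"
fi
```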