How to confirm whether Tensor Cores are being used in cuSPARSE

Hi Community member,

I use FP16 in the cusparseSpMM() function and use the nvprof tool to analyze performance.

root@xxxxx:/home/ubuntu/csr_mm_cuda# nvprof --print-gpu-trace --metrics half_precision_fu_utilization ./nvidia-spmm.out 2
==1091== NVPROF is profiling process 1091, command: ./nvidia-spmm.out 2
0.456082
warm up
0.362930
==1091== Profiling application: ./nvidia-spmm.out 2
==1091== Profiling result:
         Device   Context    Stream                Kernel  half_precision_fu_utilization

Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)
Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)


But I can’t get any Tensor Core information.
Could you please show me how to confirm whether Tensor Cores are being used in cuSPARSE?

The tensorcore usage information is in the output you posted, in the column under the heading half_precision_fu_utilization. The kernels that show Idle (0) are not using tensorcore. The kernels that show Low (1) are using tensorcore (basically the csrmm operations). That’s exactly where you would expect the usage to show up: in the matrix-matrix multiply ops.
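For a longer trace, the rows with a nonzero level can be picked out with a small script. The following is a minimal Python sketch, assuming the table layout matches the output posted above: columns separated by two or more spaces, with the utilization level (for example `Idle (0)`) in the last column. The function name `kernels_with_activity` and the sample text are illustrative and not part of nvprof. It only reads the last column, so it applies to whichever utilization metric was requested.

```python
import re

# Matches a utilization cell such as "Idle (0)" or "Low (1)".
LEVEL_RE = re.compile(r"[A-Za-z]+ ?\((\d+)\)")


def kernels_with_activity(trace_text):
    """Return (kernel, level_cell) pairs for rows whose level is above 0.

    Assumes the gpu-trace metric table layout shown in the post:
    Device, Context, Stream, Kernel, <metric>, separated by 2+ spaces.
    """
    hits = []
    for line in trace_text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 5:
            continue  # not a table row (banner lines, program output, blanks)
        match = LEVEL_RE.fullmatch(fields[-1])
        if match is None:
            continue  # header row: last field is the metric name
        if int(match.group(1)) > 0:
            hits.append((fields[-2], fields[-1]))
    return hits


# Sample rows copied from the output in the post (truncated kernel names).
SAMPLE_TRACE = """
==1091== Profiling result:
         Device   Context    Stream                Kernel  half_precision_fu_utilization
Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)
"""

if __name__ == "__main__":
    for kernel, level in kernels_with_activity(SAMPLE_TRACE):
        print(kernel, level)
```

To use it on a real run, save the nvprof output to a file and pass the file contents to `kernels_with_activity`.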

This blog covers some basics, but you seem to be already aware of it. If you drop the --metrics ... switch from your command but keep the --print-gpu-trace switch, you can see the actual kernels that use tensorcore. On V100 they will typically have 884 in the kernel name. But even if you don’t find that, the metric output is accurate.

Thank you!

So even in cuSPARSE, Tensor Cores are used automatically?

It is already publicly stated that cusparse may use tensorcore in some cases. I cannot speak to your case specifically, but as already discussed, the profiler output seems to indicate some tensorcore usage.
