How to confirm Tensor Core is working or not in CuSPARSE

Rookie_programmer · April 28, 2023, 9:27am

Hi Community member,

I am using FP16 in the cusparseSpMM() function and profiling it with the nvprof tool.

root@xxxxx:/home/ubuntu/csr_mm_cuda# nvprof --print-gpu-trace --metrics half_precision_fu_utilization ./nvidia-spmm.out 2
==1091== NVPROF is profiling process 1091, command: ./nvidia-spmm.out 2
0.456082
warm up
0.362930
==1091== Profiling application: ./nvidia-spmm.out 2
==1091== Profiling result:
         Device   Context    Stream                Kernel  half_precision_fu_utilization

Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)
Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)

But I can’t find any Tensor Core information in this output.
Could you please tell me how to confirm whether Tensor Cores are being used in cuSPARSE?

Robert_Crovella · April 28, 2023, 2:19pm

The tensorcore usage information is in the output you posted, in the column under the heading half_precision_fu_utilization. The operations that show Idle (0) are not using tensorcore. The operations that show Low (1) are using tensorcore (basically the csrmm operations). That’s exactly where you would expect the usage to show up: in the matrix-matrix multiply ops.
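As a minimal sketch of the reading described above, the following Python snippet scans nvprof's per-kernel metric trace and lists the kernels whose utilization level is above Idle (0). The function name `non_idle_kernels` and the regular expression are illustrative assumptions, not part of nvprof; the sample text is copied from the output posted in this thread. nvprof also lists a separate `tensor_precision_fu_utilization` metric for Volta-class GPUs; the same parsing should apply to it, assuming the column layout is the same.

```python
import re

# One data row of `nvprof --print-gpu-trace --metrics <metric>` output as posted
# in this thread: device, context, stream, kernel, then a level like "Low (1)".
# Columns are assumed to be separated by two or more spaces.
ROW = re.compile(
    r"^(?P<device>.+?)\s{2,}(?P<context>\d+)\s+(?P<stream>\d+)\s{2,}"
    r"(?P<kernel>.+?)\s{2,}(?P<level>[A-Za-z]+)\s*\((?P<value>\d+)\)\s*$"
)


def non_idle_kernels(trace_text):
    """Return (kernel, level, value) for every row whose metric value is > 0."""
    hits = []
    for line in trace_text.splitlines():
        match = ROW.match(line.strip())
        if match and int(match.group("value")) > 0:
            hits.append(
                (match.group("kernel"), match.group("level"), int(match.group("value")))
            )
    return hits


# Sample rows copied from the profiler output posted above.
SAMPLE = """\
         Device   Context    Stream                Kernel  half_precision_fu_utilization

Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)
Tesla V100S-PCI         1         7  void cusparse::matri                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::parti                       Idle (0)
Tesla V100S-PCI         1         7  void cusparse::csrmm                        Low (1)
"""

for kernel, level, value in non_idle_kernels(SAMPLE):
    print(f"{kernel}: {level} ({value})")
```

On the sample above, only the two csrmm rows are reported, which matches the reading given in this reply.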

This blog covers some basics, but you seem to be already aware of it. If you drop the --metrics ... switch from your command but keep the --print-gpu-trace switch, you can see the actual kernels that use tensorcore. On V100 they will typically have 1688 in the kernel name. But even if you don’t find that, the metric output is accurate.
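The kernel-name check can be sketched in Python as a simple substring filter. The helper `kernels_with_marker` is a hypothetical name, and the marker is a parameter because kernel naming may vary by architecture and library version; "1688" is used here only because this reply mentions it. The names below are the truncated ones from the metric trace posted earlier, so a plain --print-gpu-trace run may show longer names.

```python
def kernels_with_marker(kernel_names, marker="1688"):
    """Return the kernel names that contain the given marker substring."""
    return [name for name in kernel_names if marker in name]


# Kernel names as they appear (truncated) in the trace posted in this thread.
names = ["void cusparse::matri", "void cusparse::parti", "void cusparse::csrmm"]

matches = kernels_with_marker(names)
print(matches if matches else "no kernel name contains the marker")
```

For these truncated names the filter finds nothing, which is the case where the metric column is the evidence to rely on instead.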

Rookie_programmer · April 28, 2023, 2:35pm

Thank you!

So even in cuSPARSE, Tensor Cores are used automatically?

Robert_Crovella · April 28, 2023, 3:22pm

It is already publicly stated that cusparse may use tensorcore in some cases. I cannot speak to your case specifically, but as already discussed, the profiler output seems to indicate some tensorcore usage.

