The tensorcore usage information is in the output you posted, in the column under the heading half_precision_fu_utilization. The operations that show Idle (0) are not using tensorcore. The operations that show Low (1) are using tensorcore (basically the csrmm operations). That's exactly where you would expect the usage to show up: in the matrix-matrix multiply ops.
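If you want to apply that reading rule to many kernels at once, here is a minimal Python sketch. It assumes you have already copied the per-kernel half_precision_fu_utilization values out of the profiler report by hand; the kernel names below are placeholders, not real output, and the function name is my own.

```python
import re

def uses_tensorcore(utilization: str) -> bool:
    """Return True when the level in a string such as 'Low (1)' is above 0."""
    match = re.search(r"\((\d+)\)", utilization)
    if match is None:
        raise ValueError(f"no utilization level found in {utilization!r}")
    return int(match.group(1)) > 0

# Hypothetical readings typed in by hand; the names are placeholders.
readings = {
    "csrmm_kernel_placeholder": "Low (1)",
    "other_kernel_placeholder": "Idle (0)",
}

# Keep only the kernels whose utilization level is non-zero.
active = [name for name, level in readings.items() if uses_tensorcore(level)]
print(active)
```

The level in parentheses is the only part the sketch looks at, so it accepts both "Low (1)" and "Low(1)".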
This blog covers some basics, but you seem to be already aware of them. If you drop the --metrics ... switch from your command but keep the --print-gpu-trace switch, you can see the actual kernels that use tensorcore. On V100 they will typically have 1688 in the kernel name. But even if you don't find such a kernel, the metric output is accurate.
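To scan a long trace for those kernels, something like the following Python sketch may help. It assumes you have already extracted the kernel names from the --print-gpu-trace output into a list; the names shown are made up for illustration, and the helper name is hypothetical.

```python
def kernels_matching(kernel_names, marker="1688"):
    """Return the kernel names that contain the marker substring."""
    return [name for name in kernel_names if marker in name]

# Made-up kernel names for illustration only, not real profiler output.
sample_names = [
    "example_h1688gemm_kernel",
    "example_csr_copy_kernel",
]

print(kernels_matching(sample_names))
```

Pass only the kernel names, not whole trace lines: a plain substring test on a full line could also match the digits inside a timestamp or duration.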
It is already publicly stated that cusparse may use tensorcore in some cases. I cannot speak to your case specifically, but as already discussed, the profiler output seems to indicate some tensorcore usage.