The profilers can show you the relative loading of various pipes, such as the math and Tensor Core pipes. Please ask profiler questions on the appropriate profiler forum. The profilers can also be run from the command line.
If by “NVIDIA tools” you mean nvidia-smi, the behavior is described here. It doesn’t refer to either CUDA or Tensor cores.
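As a rough illustration of that point, here is a minimal Python sketch that reads the utilization figures nvidia-smi reports. The helper names `parse_utilization` and `query_utilization` are made up for this example. `utilization.gpu` and `utilization.memory` are standard `--query-gpu` fields. As I understand the nvidia-smi documentation, `utilization.gpu` is the percent of time over the sample period during which one or more kernels was executing, so it says nothing about which pipes were busy.

```python
import subprocess


def parse_utilization(csv_text):
    """Parse rows of 'gpu_util, mem_util' (one row per GPU) into dicts.

    Hypothetical helper for this sketch. Fields that are not plain
    integers (for example '[N/A]') are returned as None.
    """
    def to_int(field):
        field = field.strip()
        return int(field) if field.isdigit() else None

    rows = []
    for line in csv_text.strip().splitlines():
        gpu_field, mem_field = line.split(",")
        rows.append({
            "gpu_util_pct": to_int(gpu_field),
            "mem_util_pct": to_int(mem_field),
        })
    return rows


def query_utilization():
    """Run nvidia-smi once and return one dict per GPU.

    Assumes nvidia-smi is on PATH. The numbers are time-based
    utilization, not a CUDA core or Tensor core occupancy figure.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu,utilization.memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_utilization(result.stdout)
```

Calling `query_utilization()` on a machine with an NVIDIA driver installed should return one entry per GPU. For per-pipe loading you would still need one of the profilers.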
You can always request enhancements to the CUDA ecosystem by filing a bug.