Concurrent execution of CUDA and Tensor cores

The profilers can show you the relative loading of the various pipes, such as the math and Tensor Core pipes. They can also be run from the command line. Please ask profiler-specific questions on the appropriate profiler forum.
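As a rough illustration of the command-line route, here is a minimal sketch. It assumes the profiler in question is Nsight Compute (`ncu`) and that its `ComputeWorkloadAnalysis` section is where you want to look for pipe utilization. The script name `profile_pipes.py` and the helper `build_ncu_command` are hypothetical, made up for this example.

```python
import shutil
import subprocess
import sys


def build_ncu_command(app, app_args=()):
    """Build an Nsight Compute CLI invocation for a given application.

    Assumption: the ComputeWorkloadAnalysis section is the one of interest.
    It is requested explicitly rather than relying on whichever sections
    ncu collects by default.
    """
    return ["ncu", "--section", "ComputeWorkloadAnalysis", app, *app_args]


if __name__ == "__main__":
    # Hypothetical usage: python profile_pipes.py ./my_app arg1 arg2
    if len(sys.argv) < 2:
        sys.exit("usage: profile_pipes.py APP [APP_ARGS...]")
    if shutil.which("ncu") is None:
        sys.exit("ncu not found on PATH")
    cmd = build_ncu_command(sys.argv[1], sys.argv[2:])
    # ncu writes its per-kernel report to stdout.
    subprocess.run(cmd, check=True)
```

The command is built as a list rather than a single string so that application arguments containing spaces are passed through unchanged.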

If by “NVIDIA tools” you mean nvidia-smi, the behavior is described here. It doesn’t refer to either CUDA or Tensor cores.
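To see what nvidia-smi actually gives you, here is a small sketch that reads its `utilization.gpu` field. nvidia-smi documents that field as the percent of time over the past sample period during which one or more kernels was executing, so it is a single coarse number per GPU with no split by core type. The helper names `parse_utilization` and `read_utilization` are hypothetical.

```python
import subprocess

# Query one line per GPU, e.g. "0, 37", with no header and no "%" suffix.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Parse 'index, utilization' lines into {gpu_index: percent or None}."""
    result = {}
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        # Some GPUs report a non-numeric placeholder; record that as None.
        result[int(index)] = int(util) if util.isdigit() else None
    return result


def read_utilization():
    """Run nvidia-smi (requires an NVIDIA driver) and parse its output."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_utilization(out)
```

Calling `read_utilization()` on a machine with the driver installed returns one percentage per GPU index. Nothing in that result distinguishes CUDA core activity from Tensor core activity.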

You can always request enhancements to the CUDA ecosystem by filing a bug.
