How to monitor SM utilization and SM occupancy?

nvidia-smi gives volatile GPU util, which is useful if you want to know whether the GPU is being used at all. It reports the percentage of time during the sampling interval in which a kernel was running on the GPU.
nvidia-smi dmon gives an sm%, but I do not understand exactly what it means.

# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0    43    48     0     1     0     0  3505   936
    0    43    48     0     1     0     0  3505   936

I was hoping to know how this sm% number was calculated.
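To make the volatile GPU util definition above concrete, here is a minimal sketch of how I understand it. The function name `busy_percent` and the kernel timestamps are made up for illustration; this is not how the driver actually computes the value, and I do not know whether sm% follows the same rule.

```python
def busy_percent(kernel_intervals, window_start, window_end):
    """Percent of [window_start, window_end) during which at least one kernel ran."""
    # Clip each (start, end) interval to the sampling window and sort by start time.
    clipped = sorted(
        (max(s, window_start), min(e, window_end))
        for s, e in kernel_intervals
        if e > window_start and s < window_end
    )
    busy = 0.0
    cursor = window_start
    for s, e in clipped:
        # Skip time already counted so overlapping kernels are not counted twice.
        s = max(s, cursor)
        if e > s:
            busy += e - s
            cursor = e
    return 100.0 * busy / (window_end - window_start)


# Hypothetical kernel timestamps in milliseconds, with a 1000 ms sampling window.
sample = [(0, 200), (150, 300), (900, 1200)]
print(busy_percent(sample, 0, 1000))  # 40.0
```

Under this definition a single small kernel that keeps only one SM busy still counts as "GPU in use" for that time, which is why this number alone does not tell me how well the SMs are used.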

NVML, on the other hand, gives the current and max SM clock frequency. Is there a way to get an SM utilization percentage, similar to CPU utilization %, to measure how well a program uses the SM cores?
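For reference, this is roughly what I can read from NVML today, sketched with the `pynvml` Python bindings (assumed to be installed, for example from the `nvidia-ml-py` package). The helper names `read_sm_metrics` and `clock_percent` are my own, not part of NVML.

```python
def read_sm_metrics(gpu_index=0):
    """Return SM clocks (MHz) and coarse utilization (%) for one GPU via NVML."""
    import pynvml  # assumption: NVML Python bindings are installed

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        return {
            "sm_clock_mhz": pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM),
            "sm_clock_max_mhz": pynvml.nvmlDeviceGetMaxClockInfo(handle, pynvml.NVML_CLOCK_SM),
            "gpu_util_percent": util.gpu,     # time a kernel was running
            "mem_util_percent": util.memory,  # time memory was being read or written
        }
    finally:
        pynvml.nvmlShutdown()


def clock_percent(metrics):
    """Current SM clock as a percentage of the maximum SM clock (hypothetical helper)."""
    return 100.0 * metrics["sm_clock_mhz"] / metrics["sm_clock_max_mhz"]


# On a machine with an NVIDIA GPU and driver:
#   m = read_sm_metrics(0)
#   print(m, clock_percent(m))
```

The clock ratio only says how fast the SMs are clocked, not how busy they are, so none of these values is the per-SM utilization I am looking for.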

Also, how does one go about monitoring the SM occupancy?

From my understanding of NVIDIA GPUs, maximizing utilization means keeping all the SMs occupied and issuing as many instructions as possible on each of them.
Can the NVML API be used to get this information?
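By occupancy I mean the ratio of active warps on an SM to the maximum number of warps the SM supports. A minimal sketch of the theoretical value follows; the function name is mine, the default limit of 64 warps per SM is an assumption that varies by architecture, and `blocks_per_sm` (how many blocks fit on one SM given registers and shared memory) has to be supplied by the caller.

```python
def theoretical_occupancy(threads_per_block, blocks_per_sm,
                          max_warps_per_sm=64, warp_size=32):
    """Active warps per SM divided by the SM's warp limit (theoretical, not measured)."""
    # A partially filled warp still occupies a whole warp slot, so round up.
    warps_per_block = -(-threads_per_block // warp_size)
    # Resident warps cannot exceed the hardware limit.
    active_warps = min(warps_per_block * blocks_per_sm, max_warps_per_sm)
    return active_warps / max_warps_per_sm


# 256 threads per block with 4 resident blocks per SM: 32 of 64 warp slots.
print(theoretical_occupancy(256, 4))  # 0.5
```

This is only the theoretical upper bound from the launch configuration. What I would like to monitor is the occupancy actually achieved while the program runs.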
