nvidia-smi offers a memory.used metric that reports “Total memory allocated by active contexts”. We would like to track the same value for a single MIG partition (e.g., an A100 40GB with a 1g.5gb profile). The metrics DCGM supports include Memory BW Utilization, but that measures something different.
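For context, the whole-GPU value can be read roughly like this. It is only a minimal sketch, assuming nvidia-smi is on the PATH; the helper names parse_memory_used and query_memory_used are made up for illustration, and it covers the per-GPU value only, not the per-partition one we are after.

```python
import subprocess

# Query fields passed to nvidia-smi; "nounits" leaves plain MiB numbers.
QUERY = ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"]


def parse_memory_used(csv_text):
    """Turn 'index, memory.used' CSV lines into {gpu_index: MiB or None}."""
    used = {}
    for line in csv_text.strip().splitlines():
        index, value = (field.strip() for field in line.split(",", 1))
        try:
            used[int(index)] = int(value)
        except ValueError:
            # Non-numeric values such as "[N/A]" are kept as None.
            used[int(index)] = None
    return used


def query_memory_used():
    """Hypothetical helper: run nvidia-smi and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)
```

Calling query_memory_used() on a machine with the NVIDIA driver installed should return one entry per physical GPU.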
Is there a DCGM metric, or any other way, to track used memory for a MIG partition?