With the Visual Profiler, I ran the simple matrixMul sample and collected two metrics: device memory utilization and system memory utilization. I used two large matrices, and during the run nvidia-smi reported about 2200 MiB of 4000 MiB of device memory in use.
After the run, the profiler reports:
Device memory utilization = Low (2)
System memory utilization = Low (1)
Is something wrong here, or does "utilization" mean something different in the profiler? Since a bit more than half of the device memory was in use, I expected device memory utilization to be about Medium (5). Is that expectation wrong?
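
To show where my expected value of 5 comes from, here is a minimal Python sketch of the arithmetic. It assumes the profiler's 0 to 10 level is a linear mapping of occupied device memory, which is only my guess and may not be what the metric actually measures. The helper name `expected_level` is hypothetical and not part of any CUDA tool.

```python
def expected_level(used_mib: int, total_mib: int, scale_max: int = 10) -> int:
    """Map occupied memory onto a 0..scale_max level.

    ASSUMPTION: the level is a linear function of memory occupancy.
    This is my reading of "utilization", not documented profiler behavior.
    """
    if total_mib <= 0:
        raise ValueError("total_mib must be positive")
    # Integer floor division avoids floating-point rounding surprises.
    return (used_mib * scale_max) // total_mib


# Numbers from nvidia-smi during my run: about 2200 MiB used of 4000 MiB.
print(expected_level(2200, 4000))
```

Under that assumption, 2200 of 4000 MiB lands on level 5, while the profiler shows 2.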