Is there any reason nvidia-smi would underestimate the memory usage of a process? I got similar results with nvprof: 1.6GB shows up for the process in both nvidia-smi and nvprof, but the global counter shows 9.3GB in use. My own estimate of the memory footprint is around 9-10GB.
I reset the GPU with nvidia-smi --gpu-reset before running the experiment.
I now think that the global counter includes memory that processes have reserved or claimed, while the per-process figure covers only what the process is actually using. Under this hypothesis, the fact that nvprof reported the same values was a bit of a red herring, which led me to believe the output was suspect as well.
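One rough way to look at the gap is to compare the global "used" figure with the sum of the per-process figures. The snippet below is only a sketch: it assumes nvidia-smi is on the PATH and uses its `--query-gpu=memory.used` and `--query-compute-apps=pid,used_memory` CSV queries. The helper names (`parse_mib`, `memory_gap`, `report`) are made up for illustration, and the values are summed over all GPUs, so on a multi-GPU machine it would need filtering by device. It only measures the difference; it does not prove where the difference comes from.

```python
import subprocess


def parse_mib(csv_text):
    """Return the last CSV field of each non-empty line as an int (MiB)."""
    values = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        field = line.split(",")[-1].strip()
        # Skip non-numeric placeholders such as "[N/A]".
        if field.isdigit():
            values.append(int(field))
    return values


def memory_gap(gpu_csv, apps_csv):
    """Compare the global counter with the sum of per-process usage.

    Returns (global_used, per_process_total, difference), all in MiB.
    """
    global_used = sum(parse_mib(gpu_csv))
    per_process = sum(parse_mib(apps_csv))
    return global_used, per_process, global_used - per_process


def query(option):
    """Run one nvidia-smi CSV query and return its raw text output."""
    result = subprocess.run(
        ["nvidia-smi", option, "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def report():
    """Print the global usage, the per-process total, and the gap."""
    gpu_csv = query("--query-gpu=memory.used")
    apps_csv = query("--query-compute-apps=pid,used_memory")
    global_used, per_process, gap = memory_gap(gpu_csv, apps_csv)
    print(f"global used:        {global_used} MiB")
    print(f"sum over processes: {per_process} MiB")
    print(f"unattributed:       {gap} MiB")
```

Calling `report()` while the experiment is running prints the three numbers. If the hypothesis is right, I would expect the "unattributed" line to account for most of the difference between the two figures above.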