I have set up Docker on my DGX machine and have been running multiple Docker sessions for different tasks.
I want to measure GPU usage per Docker session for each user. How can I do this? nvidia-smi is only helpful for overall usage statistics. The login is the same for each user, so I need to measure how much each Docker session consumes. Could you please suggest a solution?
It sounds like you and your team have reached the point where manually doing a docker run ... is going to become increasingly painful. If you want to monitor usage, jobs, and so on, there are easier ways.
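For a quick per-container view on a single machine, one possible sketch is to join nvidia-smi's per-process listing with each process's cgroup, which for Docker normally contains the 64-character container ID. This assumes the script runs on the host (not inside a container), so the PIDs reported by nvidia-smi match the host's /proc. The function names (`parse_compute_apps`, `container_of`, `memory_per_container`, `report`) are hypothetical helpers written for this illustration, and the sketch only covers GPU memory of compute processes, not utilization.

```python
import re
import subprocess
from collections import defaultdict

# Docker container IDs are 64 hex characters; they normally appear in
# /proc/<pid>/cgroup under both cgroup v1 and v2 (assumption: default Docker setup).
CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def parse_compute_apps(csv_text):
    """Parse `pid, used_memory` rows (MiB) from nvidia-smi CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        pid, mem = (field.strip() for field in line.split(","))
        # used_memory can be reported as "[N/A]"; count that as 0 MiB.
        rows.append((int(pid), int(mem) if mem.isdigit() else 0))
    return rows


def container_of(cgroup_text):
    """Return the short container ID found in a cgroup file, or 'host'."""
    match = CONTAINER_ID.search(cgroup_text)
    return match.group(0)[:12] if match else "host"


def memory_per_container(apps, read_cgroup):
    """Sum GPU memory (MiB) per container for a list of (pid, mem) pairs."""
    totals = defaultdict(int)
    for pid, mem in apps:
        totals[container_of(read_cgroup(pid) or "")] += mem
    return dict(totals)


def read_cgroup(pid):
    """Read /proc/<pid>/cgroup; return '' if the process has already exited."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return handle.read()
    except OSError:
        return ""


def report():
    """Query nvidia-smi on the host and print GPU memory per container."""
    output = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    for container, mib in sorted(memory_per_container(
            parse_compute_apps(output), read_cgroup).items()):
        print(f"{container}\t{mib} MiB")
```

Calling `report()` on the host prints one line per container; the short IDs can then be matched against the output of `docker ps` to see which session each one belongs to. Since everyone shares a login, the container is the natural unit of accounting here.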