per-process resource accounting

I’d like to monitor per-process usage of the GPUs we’re using. I wrote a simple C++ program with OpenCV 3 that uses the OpenCL libraries for background subtraction in video. The tool decodes H.264 video (on the CPU) and then runs background subtraction (on the GPU). I’d like to run multiple such tasks concurrently (in separate processes) on the GPU and measure their usage. I’m doing this on Windows with a GeForce GTX 980 Ti. I use “c:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe -l 1 -q -d ACCOUNTING” to monitor per-process usage.
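
For context, the core of the program is roughly the following (a simplified sketch rather than the actual code; the MOG2 subtractor choice and the input path are just placeholders):

// Simplified sketch of the workload: H.264 decode on the CPU via VideoCapture,
// background subtraction on the GPU through OpenCV's transparent OpenCL path (UMat).
#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>

int main(int argc, char** argv)
{
    cv::ocl::setUseOpenCL(true);  // make sure the OpenCL code path is active

    cv::VideoCapture cap(argc > 1 ? argv[1] : "input.mp4");  // decoding happens on the CPU
    if (!cap.isOpened())
        return 1;

    cv::Ptr<cv::BackgroundSubtractor> bg = cv::createBackgroundSubtractorMOG2();

    cv::Mat frame;
    cv::UMat uframe, fgmask;  // UMat buffers are backed by GPU memory when OpenCL is enabled
    while (cap.read(frame))
    {
        frame.copyTo(uframe);       // upload the decoded frame
        bg->apply(uframe, fgmask);  // runs as OpenCL kernels on the GPU
    }
    return 0;
}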

When I start a single copy of the task, I see the following usage (and CPU at 25%).

Process ID                  : 10196
            GPU Utilization         : 25 %
            Memory Utilization      : 8 %
            Max memory usage        : 0 MiB
            Time                    : 15059 ms
            Is Running              : 0

This seems reasonable. However, when I start 2 concurrent copies of the same task, I see the following (CPU at 50%):

Process ID                  : 5636
           GPU Utilization         : 45 %
           Memory Utilization      : 16 %
           Max memory usage        : 0 MiB
           Time                    : 17689 ms
           Is Running              : 0
       Process ID                  : 8200
           GPU Utilization         : 43 %
           Memory Utilization      : 15 %
           Max memory usage        : 0 MiB
           Time                    : 17669 ms
           Is Running              : 0

When I start 4, I get this (CPU at 100%):

Process ID                  : 13592
            GPU Utilization         : 52 %
            Memory Utilization      : 19 %
            Max memory usage        : 0 MiB
            Time                    : 27207 ms
            Is Running              : 0
        Process ID                  : 14944
            GPU Utilization         : 54 %
            Memory Utilization      : 20 %
            Max memory usage        : 0 MiB
            Time                    : 27082 ms
            Is Running              : 0
        Process ID                  : 11492
            GPU Utilization         : 55 %
            Memory Utilization      : 20 %
            Max memory usage        : 0 MiB
            Time                    : 26987 ms
            Is Running              : 0
        Process ID                  : 7440
            GPU Utilization         : 55 %
            Memory Utilization      : 20 %
            Max memory usage        : 0 MiB
            Time                    : 26949 ms
            Is Running              : 0

It seems that nvidia-smi.exe is not really breaking down the utilization by process; since I’m running the same task, its average GPU utilization shouldn’t go up from 25% to 55%. And 4 * 55% > 100%, which doesn’t make sense, because all four copies were running concurrently.

Am I missing something about this tool? I’d like to get per-process utilization numbers like the ones Windows Resource Monitor gives me for the CPU.
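
For what it’s worth, the same accounting fields can also be read programmatically through NVML (assuming accounting mode is enabled on the device and the program is linked against the NVML library); a minimal sketch:

// Minimal NVML sketch: read the same per-process accounting stats that
// "nvidia-smi -q -d ACCOUNTING" reports. Requires accounting mode to be
// enabled on device 0; link against nvml.lib (Windows) or libnvidia-ml (Linux).
#include <cstdio>
#include <nvml.h>

int main()
{
    if (nvmlInit() != NVML_SUCCESS)
        return 1;

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS)
    {
        nvmlShutdown();
        return 1;
    }

    unsigned int pids[128];
    unsigned int count = 128;
    if (nvmlDeviceGetAccountingPids(dev, &count, pids) == NVML_SUCCESS)
    {
        for (unsigned int i = 0; i < count; ++i)
        {
            nvmlAccountingStats_t s;
            if (nvmlDeviceGetAccountingStats(dev, pids[i], &s) != NVML_SUCCESS)
                continue;
            std::printf("PID %u: gpu %u%%, mem %u%%, max mem %llu B, time %llu ms, running %u\n",
                        pids[i], s.gpuUtilization, s.memoryUtilization,
                        s.maxMemoryUsage, s.time, s.isRunning);
        }
    }

    nvmlShutdown();
    return 0;
}

Note that this reports the same quantities with the same semantics as nvidia-smi, so it doesn’t change the attribution question above.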

FYI, this is the info I get by running nvidia-smi by itself:

c:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe
Wed Oct 28 14:35:50 2015
+------------------------------------------------------+
| NVIDIA-SMI 358.50     Driver Version: 358.50         |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti WDDM  | 0000:05:00.0      On |                  N/A |
|  0%   43C    P8    17W / 250W |    953MiB /  6144MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0         4  C+G   Insufficient Permissions                     N/A      |
|    0       604  C+G   Insufficient Permissions                     N/A      |
|    0      5776  C+G   ...iles (x86)\Internet Explorer\IEXPLORE.EXE N/A      |
|    0      6556  C+G   Insufficient Permissions                     N/A      |
|    0      7068  C+G   ...icrosoft Office 15\root\office15\lync.exe N/A      |
|    0      7276  C+G   ...rosoft Office 15\root\office15\groove.exe N/A      |
|    0      9304  C+G   ...osoft Office 15\Root\Office15\WINWORD.EXE N/A      |
|    0     10608  C+G   ...osoft Office 15\root\office15\ONENOTE.EXE N/A      |
|    0     10620  C+G   ...iles (x86)\Internet Explorer\IEXPLORE.EXE N/A      |
|    0     10680  C+G   ...osoft Office 15\root\office15\OUTLOOK.EXE N/A      |
|    0     11432  C+G   C:\Program Files (x86)\VideoLAN\VLC\vlc.exe  N/A      |
|    0     12320  C+G   ...iles (x86)\Internet Explorer\IEXPLORE.EXE N/A      |
|    0     12980  C+G   ...ce 15\Root\Office15\NAMECONTROLSERVER.EXE N/A      |
|    0     13972  C+G   ...crosoft Office 15\root\office15\EXCEL.EXE N/A      |
+-----------------------------------------------------------------------------+

Thanks,
Peter


Dear Peter,

It is quite sad that your request has not been answered for over 7 years. I was wondering whether you ever found a proper solution, because I have exactly the same requirement. I am also very surprised that none of the NVIDIA tools can report the GPU memory used per PID, even though nvidia-smi apparently aggregates it itself:

“memory.used”
Total memory allocated by active contexts.

I will start another thread asking for basically the same thing, but on Linux and without any custom application involved. I would assume NVIDIA would be happy to provide a solution for such expensive machines owned by their customers.
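
For reference, the per-process query NVML does expose is nvmlDeviceGetComputeRunningProcesses (and its graphics counterpart), which lists memory for currently running processes only; whether it returns real numbers rather than N/A depends on platform, driver mode and permissions. A minimal sketch:

// Minimal NVML sketch: list currently running GPU processes and their memory.
// usedGpuMemory may be unavailable (shown as N/A by nvidia-smi) depending on
// platform/driver mode, e.g. under Windows WDDM, as in the table above.
#include <cstdio>
#include <nvml.h>

int main()
{
    if (nvmlInit() != NVML_SUCCESS)
        return 1;

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS)
    {
        nvmlShutdown();
        return 1;
    }

    nvmlProcessInfo_t procs[64];
    unsigned int count = 64;
    if (nvmlDeviceGetComputeRunningProcesses(dev, &count, procs) == NVML_SUCCESS)
    {
        for (unsigned int i = 0; i < count; ++i)
            std::printf("PID %u uses %llu bytes of GPU memory\n",
                        procs[i].pid, (unsigned long long)procs[i].usedGpuMemory);
    }

    nvmlShutdown();
    return 0;
}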

Best regards, Ron

You’re imagining that when process A is using the GPU, process B is not, and furthermore that nvidia-smi will accurately convey this.

Neither of those statements is true.

I don’t think nvidia-smi is the tool you want; I’m not sure what you’re asking for is achievable or even makes sense; and I don’t have suggestions for any tools that do that (unless you want to use a profiler).

Just because process A is using the GPU does not mean that process B is not.

In addition, nvidia-smi is a sampled measurement. It’s not guaranteed to give precise accounting.
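
To make that concrete: the underlying NVML utilization counters are device-wide and sampled over short intervals; there is no per-process breakdown in them. A rough sketch of such a query (device 0 and NVML assumed):

// Rough sketch: device-wide utilization as sampled by NVML
// (nvmlDeviceGetUtilizationRates). There is no per-process breakdown here;
// the percentages cover the whole GPU over the most recent sample period.
#include <cstdio>
#include <nvml.h>

int main()
{
    if (nvmlInit() != NVML_SUCCESS)
        return 1;

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS)
    {
        nvmlShutdown();
        return 1;
    }

    nvmlUtilization_t u;
    if (nvmlDeviceGetUtilizationRates(dev, &u) == NVML_SUCCESS)
        std::printf("GPU %u%%, memory controller %u%% (whole device)\n", u.gpu, u.memory);

    nvmlShutdown();
    return 0;
}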

Again, I have no suggestions for you.

If this were a request about CUDA instead of OpenCL, and furthermore if this were a datacenter GPU, the DCGM tool might be of interest. Even then it may not meet the objectives; I’m not sure the objectives are achievable.

With CPU tools like top, I can easily run into a situation where a utilization number greater than 100% is observed. I’m sure that someone will come along and tell me that is not a valid analogy. At any rate, the nvidia-smi tool does not behave this way, by design.