Record profiled data when ending program with ctrl-c

Hi. I’m trying to profile the GPU usage of a TensorRT server-client model.
Here’s what I’m doing:

  1. Run nvprof --profile-all-processes -o results%p.nvvp in one terminal of a Docker container.
  2. Start the TensorRT server in a different terminal of the same Docker container as in step 1.
    -> Up to this point, step 1's nvprof recognizes that a process is running, since its terminal shows: NVPROF is profiling process 920, command: /opt/tensorrtserver/bin/trtserver --model-store=/modelstore --allow-profiling=true --allow-metrics=true --allow-gpu-metrics=true
  3. Start the TensorRT client in a different terminal.
    -> Step 3 works fine as well; the correct results appear in its terminal.
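For reference, the three steps above can be sketched as the following terminal commands (the client invocation in terminal 3 is a placeholder, since the exact client command is not shown here):

```shell
# Terminal 1: profile every CUDA process on the machine;
# %p in the output name expands to each process's PID
nvprof --profile-all-processes -o results%p.nvvp

# Terminal 2 (same container): start the TensorRT server
/opt/tensorrtserver/bin/trtserver --model-store=/modelstore \
    --allow-profiling=true --allow-metrics=true --allow-gpu-metrics=true

# Terminal 3: run the client (placeholder; substitute your actual client)
# ./my_trt_client ...
```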

Now, when the client request from step 3 finishes, the client exits normally.
However, the TensorRT server (from step 2) keeps running, and as far as I know the only way to shut it down is to kill it with Ctrl-C. After I do that, terminal 1 reads: ==920== Error: Internal profiling error 4087:35.
I believe this happens because I ended the TensorRT server with Ctrl-C. As a result, when I then end nvprof (with Ctrl-C, as it tells you to), the output is only a 380 KB file, and when opened in nvvp it contains no timeline information whatsoever.

Is there a way to save profiling results when the profiled program is terminated abnormally (via Ctrl-C)? Or is there a workaround for using nvprof with the TensorRT server?

Thanks in advance!

Hi Jinha,

I suspect you ran into the security issue that forces nvprof to disable profiling support for non-root users. For instructions on enabling permissions, please refer to https://developer.nvidia.com/nvidia-development-tools-solutions-ERR_NVGPUCTRPERM-permission-issue-performance-counters. A quick solution is to run as root.
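In concrete terms, the two routes on that page look roughly like the following (the file name nvprof.conf is arbitrary, and the module option is taken from the linked NVIDIA page; double-check there for your driver version):

```shell
# Option 1: quick check - run the profiler as root
sudo nvprof <application>

# Option 2: permanently allow non-root access to GPU performance counters
echo 'options nvidia NVreg_RestrictProfilingToAdminUsers=0' | \
    sudo tee /etc/modprobe.d/nvprof.conf
# then reboot (or unload and reload the nvidia kernel module)
```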

Are you able to profile a simple application otherwise? Please try without the --profile-all-processes option:
$ nvprof <application>
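As a sanity check, one of the CUDA sample programs works well for this (the path below assumes the default CUDA samples location inside the container; adjust it to wherever the samples live on your system):

```shell
# Build and profile the vectorAdd sample as a minimal test case
cd /usr/local/cuda/samples/0_Simple/vectorAdd
make
nvprof ./vectorAdd
```

If this produces a normal profiling summary, the permission setup is fine and the problem is specific to how the server process is being terminated.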

If none of these solutions work, please provide details of your CUDA toolkit version.