nvprof fails with TensorRT inference task

g.amardeep · August 20, 2018, 12:00pm

Hi,

I am using nvprof to analyse the compute and data transfer performance of a P4 on
a pre-trained GoogleNet network.

If I run TensorRT without nvprof, the inference engine is generated and I can see the smi and giexec logs.

For some reason, when I run the same giexec command with nvprof options, it gets stuck and eventually crashes.

Here are the nvprof options I used:

nvprof --events all --print-gpu-trace --system-profiling on -u ms --log-file my_file_%p.log (followed by giexec command)

Is this an nvprof bug, or is there a configuration flag/plug (for nvprof) that I am missing?

Thanks for your help.

veraj · August 21, 2018, 3:34am

Hi, g.amardeep

This looks like the well-known slowdown caused by kernel serialization.

I suggest running the application under nvprof for a couple of minutes and then killing it. You should still get usable results.
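One way to run for a couple of minutes and then kill it is to wrap the original nvprof command in GNU `timeout`, which sends an interrupt after a fixed delay. This is only a sketch: `PROFILE_SECS` and `bounded_nvprof_cmd` are made-up names, the giexec arguments are whatever you already pass, and it is an assumption that nvprof writes a usable log when interrupted this way on your setup. The script prints the command rather than running it, so you can check it first.

```shell
#!/bin/sh
# Sketch: build a time-bounded version of the nvprof command from the question.
# Assumptions (not from the thread): PROFILE_SECS and bounded_nvprof_cmd are
# illustrative names; GNU coreutils `timeout` is available on the machine.

PROFILE_SECS="${PROFILE_SECS:-120}"   # "a couple of minutes"

# Print the command line that would be run.
# $1 = time limit in seconds, remaining arguments = application command line.
bounded_nvprof_cmd() {
    limit="$1"; shift
    # --signal=INT asks timeout to send SIGINT (like Ctrl-C) instead of SIGTERM.
    printf 'timeout --signal=INT %s nvprof --events all --print-gpu-trace --system-profiling on -u ms --log-file my_file_%%p.log %s\n' "$limit" "$*"
}

# Pass your usual giexec arguments to this script; they are appended unchanged.
bounded_nvprof_cmd "$PROFILE_SECS" giexec "$@"
```

If the printed command looks right, paste it into the shell to run it. nvprof also documents a `--timeout <seconds>` option of its own, which may be worth trying if the external interrupt does not leave a complete log.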
