TensorRT profiling in Python

Hello, I am trying to profile a model that runs on TensorRT in order to see the time spent in each layer.

However, I have not been able to set up a profiler in Python. I tried two approaches, and in both the profiler object is created successfully. With ConsoleProfiler, the process crashes during inference with a segmentation fault. With Profiler, it fails when I call the context.set_profile() method; the error trace is attached below.

The only relevant documentation I found is this: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/topics/topics/pkg_ref/infer.html#id2

What I tried:
Inheriting from ConsoleProfiler

class MyProfiler(trt.infer.ConsoleProfiler):
    def __init__(self, log_level):
        super(MyProfiler, self).__init__(log_level)

    def report_layer_time(self, layer, ms):
        pass  # body omitted in the original post

Initializing it like so:

my_prof = MyProfiler(trt.infer.LogSeverity.ERROR)

I get segmentation faults when inheriting from ConsoleProfiler.

Inheriting directly from Profiler

class MyProfiler(trt.infer.Profiler):
    def __init__(self):
        super(MyProfiler, self).__init__()

    def report_layer_time(self, layer, ms):
        pass  # body omitted in the original post

    def __report_layer_time(self, layer, ms):
        pass  # body omitted in the original post

Initializing it like so:

my_prof = MyProfiler()

In this case I get the following error:

TypeError: Swig detected a memory leak in type 'MyProfiler': no callable destructor found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "benchmark.py", line 367, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/justas/miniconda3/envs/batch35/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 124, in run
  File "benchmark.py", line 313, in main
    prepare_benchmark(framework, model)
  File "benchmark.py", line 299, in prepare_benchmark
    run_inference = trt.get_inference_handle(model, FLAGS.batch_size)
SystemError: PyEval_EvalFrameEx returned a result with an error set

I tried using exit in Python to no avail.

Is it possible to do layer-by-layer profiling using TensorRT Python API?

Take a look at how the Profiler is implemented in googlenet.py in the TensorRT examples folder.
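
As a rough illustration of the general shape such a profiler can take (this is not a copy of the sample), here is a minimal sketch that accumulates the time reported for each layer and prints an average. The base class is a plain Python stand-in so the snippet runs without TensorRT; in real use the class would derive from trt.infer.Profiler as in the question. The names LayerTimeProfiler, timings and report are hypothetical, and the assumption that report_layer_time is called once per layer per execution is mine.

```python
from collections import OrderedDict


class LayerTimeProfiler(object):
    """Hypothetical stand-in; in real use derive from trt.infer.Profiler."""

    def __init__(self):
        # With a real TensorRT base class, its __init__ would be called here.
        self.timings = OrderedDict()

    def report_layer_time(self, layer_name, ms):
        # Assumed to be called once per layer per execution, so accumulate.
        self.timings[layer_name] = self.timings.get(layer_name, 0.0) + ms

    def report(self, iterations=1):
        # Average time per layer over the given number of runs, slowest first.
        rows = [(name, total / iterations) for name, total in self.timings.items()]
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Simulate two inference runs that report times for two layers.
profiler = LayerTimeProfiler()
for _ in range(2):
    profiler.report_layer_time("conv1", 1.5)
    profiler.report_layer_time("fc1", 0.5)

for name, avg_ms in profiler.report(iterations=2):
    print("%-10s %.3f ms" % (name, avg_ms))
```

Accumulating in a dictionary keyed by layer name keeps the callback cheap and leaves the formatting to a separate step after inference has finished.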

Please define the profiler object globally (at module level). That should bypass the SWIG memory issue while we work on a solution for 4.0 GA.
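
To illustrate why a module-level object behaves differently, here is a small sketch using a plain Python stand-in class rather than a real TensorRT profiler. It assumes the failure is triggered when the profiler object is destroyed in the middle of the run, which the thread does not confirm. StandInProfiler, make_local_profiler and G_PROFILER are hypothetical names.

```python
import weakref


class StandInProfiler(object):
    """Hypothetical stand-in for a trt.infer.Profiler subclass."""

    def report_layer_time(self, layer_name, ms):
        pass


def make_local_profiler():
    # The only strong reference is local, so the object is destroyed on return.
    local_profiler = StandInProfiler()
    return weakref.ref(local_profiler)


# Module-level ("global") profiler: it stays alive until the interpreter exits.
G_PROFILER = StandInProfiler()

local_ref = make_local_profiler()
global_ref = weakref.ref(G_PROFILER)

print("local profiler alive:", local_ref() is not None)
print("global profiler alive:", global_ref() is not None)
```

Under CPython's reference counting, the function-local object is gone as soon as the function returns, while the module-level one is still alive. In real code the module-level object would be the one handed to the execution context.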