Output of different profilingVerbosities is the same


I did some experiments to figure out how to use profilingVerbosity. It's said here: TensorRT/tools/Polygraphy/examples/cli/inspect/02_inspecting_a_tensorrt_engine at main · NVIDIA/TensorRT · GitHub that

NOTE: --show layers only works if the engine was built with a profiling_verbosity other than NONE. Higher verbosities make more per-layer information available.

I used an export script for my experiments and added builderConfig->setProfilingVerbosity(nvinfer1::ProfilingVerbosity::kNONE);
When this line is omitted, the verbosity should default to ProfilingVerbosity::kDEFAULT, which is equivalent to kLAYER_NAMES_ONLY.
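For context, here is a minimal sketch of where that call sits in a typical TensorRT 8 build setup. The builder/network creation and the engine serialization are abbreviated; the function name and surrounding scaffolding are illustrative, not from the original script.

```cpp
#include <NvInfer.h>

// Sketch only: assumes `builder` is an already-created nvinfer1::IBuilder*
// and that a network definition exists; building/serializing is omitted.
void configureVerbosity(nvinfer1::IBuilder* builder)
{
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();

    // Hide per-layer information in the serialized engine.
    config->setProfilingVerbosity(nvinfer1::ProfilingVerbosity::kNONE);

    // Other levels in TensorRT 8:
    //   kLAYER_NAMES_ONLY - the default; layer names are recorded
    //   kDETAILED         - full per-layer detail for later inspection
    // ... build and serialize the engine using this config ...
}
```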

However, I receive the same output from polygraphy inspect model <path> --show layers no matter which verbosity level I select:

[I] Loading plugin library: …/plugin
[I] Loading bytes from …/experiments/model.plan
[I] ==== TensorRT Engine ====
Name: Unnamed Network 0 | Explicit Batch Engine

---- 1 Engine Input(s) ----
{in1 [dtype=float32, shape=(-1, 3, 1000, 1000)]}

---- 2 Engine Output(s) ----
{out1 [dtype=float32, shape=(-1, 10, 1, 1)],
 out2 [dtype=float32, shape=(-1, 60, 1, 1)]}

---- Memory ----
Device Memory: 27622800 bytes

---- 1 Profile(s) (4 Binding(s) Each) ----
- Profile: 0
    Binding Index: 0 (Input)  [Name: in1_1] | Shapes: min=(1, 3, 1000, 1000), opt=(8, 3, 1000, 1000), max=(12, 3, 1000, 1000)
    Binding Index: 1 (Output) [Name: out1]  | Shape: (-1, 10, 1, 1)
    Binding Index: 2 (Output) [Name: out2]  | Shape: (-1, 60, 1, 1)

---- 210 Layer(s) ----

My questions are:

  • Is this intended? Is the information shown the minimum that TensorRT requires in order to use the model?
  • When delivering a model to a client, or uploading a model to the cloud, I'd like to make it as hard as possible to reverse engineer the architecture or the model. It would therefore be nice to not expose layer information. When using ProfilingVerbosity::kNONE, which information can still be recovered, exactly?


TensorRT Version: 8.0


Have you tried the kDETAILED level?

Thank you.

We recommend you raise this query in the issues section of the Triton Inference Server GitHub repository.


How is this a Triton Server issue? The model is a TensorRT engine, and the tool is in the TensorRT repository.

I’d like to reveal as little information as possible. So the question is: what information is revealed when using kNONE? The whole architecture, the weights, etc.?

Are there any updates on this issue?


Unfortunately, even with kNONE, the engine binding names, binding shapes, and the number of layers remain visible, because that information is needed to run inference with a TRT engine. Hiding binding shapes and binding names may be impossible.
The main difference between kNONE and kLAYER_NAMES_ONLY is that with kNONE the layer names are hidden.
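That remaining visibility can be illustrated with the TensorRT 8 runtime API: the binding queries below succeed regardless of the profiling verbosity the engine was built with, since the runtime needs them to execute the engine. This is a sketch assuming `engine` was deserialized from a plan built with kNONE.

```cpp
#include <NvInfer.h>
#include <iostream>

// Sketch only: `engine` is assumed to be a deserialized ICudaEngine built
// with ProfilingVerbosity::kNONE. Binding metadata is still queryable.
void dumpVisibleInfo(const nvinfer1::ICudaEngine* engine)
{
    // Binding names and I/O direction remain visible (TensorRT 8 binding API).
    for (int i = 0; i < engine->getNbBindings(); ++i)
    {
        std::cout << (engine->bindingIsInput(i) ? "Input:  " : "Output: ")
                  << engine->getBindingName(i) << '\n';
    }
    // The layer *count* is visible too, but with kNONE the layer names
    // and per-layer details are not recorded in the engine.
    std::cout << "Layers: " << engine->getNbLayers() << '\n';
}
```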

Thank you.