I generated a .engine model from an .onnx model with the export.py tool from the Linaom1214/TensorRT-For-YOLO-Series repo on GitHub. This repo builds the .engine with the TensorRT Python API, not with trtexec (roughly along the lines of the sketch below).
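As I understand it, the build goes something like this. This is only a minimal sketch of a Python-API build, not the exact export.py code; the file names, workspace size, and INT8 handling are illustrative:

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB
config.set_flag(trt.BuilderFlag.INT8)  # a real INT8 build also needs a calibrator
# Build with detailed layer info so --exportLayerInfo returns more than names:
config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)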
From the generated .engine, I can produce two JSON files, profile.json and graph.json, with this command:

/usr/src/tensorrt/bin/trtexec --loadEngine=model.engine --exportProfile=profile.json --exportLayerInfo=graph.json

This is the output of the above command:
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --loadEngine=yolov7_NOT_dynamic_batch_INT8.engine --exportProfile=profile.json --exportLayerInfo=graph.json
[05/30/2023-10:20:41] [I] === Model Options ===
[05/30/2023-10:20:41] [I] Format: *
[05/30/2023-10:20:41] [I] Model:
[05/30/2023-10:20:41] [I] Output:
[05/30/2023-10:20:41] [I] === Build Options ===
[05/30/2023-10:20:41] [I] Max batch: 1
[05/30/2023-10:20:41] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[05/30/2023-10:20:41] [I] minTiming: 1
[05/30/2023-10:20:41] [I] avgTiming: 8
[05/30/2023-10:20:41] [I] Precision: FP32
[05/30/2023-10:20:41] [I] LayerPrecisions:
[05/30/2023-10:20:41] [I] Calibration:
[05/30/2023-10:20:41] [I] Refit: Disabled
[05/30/2023-10:20:41] [I] Sparsity: Disabled
[05/30/2023-10:20:41] [I] Safe mode: Disabled
[05/30/2023-10:20:41] [I] DirectIO mode: Disabled
[05/30/2023-10:20:41] [I] Restricted mode: Disabled
[05/30/2023-10:20:41] [I] Build only: Disabled
[05/30/2023-10:20:41] [I] Save engine:
[05/30/2023-10:20:41] [I] Load engine: yolov7_NOT_dynamic_batch_INT8.engine
[05/30/2023-10:20:41] [I] Profiling verbosity: 0
[05/30/2023-10:20:41] [I] Tactic sources: Using default tactic sources
[05/30/2023-10:20:41] [I] timingCacheMode: local
[05/30/2023-10:20:41] [I] timingCacheFile:
[05/30/2023-10:20:41] [I] Heuristic: Disabled
[05/30/2023-10:20:41] [I] Preview Features: Use default preview flags.
[05/30/2023-10:20:41] [I] Input(s)s format: fp32:CHW
[05/30/2023-10:20:41] [I] Output(s)s format: fp32:CHW
[05/30/2023-10:20:41] [I] Input build shapes: model
[05/30/2023-10:20:41] [I] Input calibration shapes: model
[05/30/2023-10:20:41] [I] === System Options ===
[05/30/2023-10:20:41] [I] Device: 0
[05/30/2023-10:20:41] [I] DLACore:
[05/30/2023-10:20:41] [I] Plugins:
[05/30/2023-10:20:41] [I] === Inference Options ===
[05/30/2023-10:20:41] [I] Batch: 1
[05/30/2023-10:20:41] [I] Input inference shapes: model
[05/30/2023-10:20:41] [I] Iterations: 10
[05/30/2023-10:20:41] [I] Duration: 3s (+ 200ms warm up)
[05/30/2023-10:20:41] [I] Sleep time: 0ms
[05/30/2023-10:20:41] [I] Idle time: 0ms
[05/30/2023-10:20:41] [I] Streams: 1
[05/30/2023-10:20:41] [I] ExposeDMA: Disabled
[05/30/2023-10:20:41] [I] Data transfers: Enabled
[05/30/2023-10:20:41] [I] Spin-wait: Disabled
[05/30/2023-10:20:41] [I] Multithreading: Disabled
[05/30/2023-10:20:41] [I] CUDA Graph: Disabled
[05/30/2023-10:20:41] [I] Separate profiling: Disabled
[05/30/2023-10:20:41] [I] Time Deserialize: Disabled
[05/30/2023-10:20:41] [I] Time Refit: Disabled
[05/30/2023-10:20:41] [I] NVTX verbosity: 0
[05/30/2023-10:20:41] [I] Persistent Cache Ratio: 0
[05/30/2023-10:20:41] [I] Inputs:
[05/30/2023-10:20:41] [I] === Reporting Options ===
[05/30/2023-10:20:41] [I] Verbose: Disabled
[05/30/2023-10:20:41] [I] Averages: 10 inferences
[05/30/2023-10:20:41] [I] Percentiles: 90,95,99
[05/30/2023-10:20:41] [I] Dump refittable layers:Disabled
[05/30/2023-10:20:41] [I] Dump output: Disabled
[05/30/2023-10:20:41] [I] Profile: Disabled
[05/30/2023-10:20:41] [I] Export timing to JSON file:
[05/30/2023-10:20:41] [I] Export output to JSON file:
[05/30/2023-10:20:41] [I] Export profile to JSON file: profile.json
[05/30/2023-10:20:41] [I]
[05/30/2023-10:20:41] [I] === Device Information ===
[05/30/2023-10:20:41] [I] Selected Device: NVIDIA GeForce RTX 2080 Ti
[05/30/2023-10:20:41] [I] Compute Capability: 7.5
[05/30/2023-10:20:41] [I] SMs: 68
[05/30/2023-10:20:41] [I] Compute Clock Rate: 1.545 GHz
[05/30/2023-10:20:41] [I] Device Global Memory: 11019 MiB
[05/30/2023-10:20:41] [I] Shared Memory per SM: 64 KiB
[05/30/2023-10:20:41] [I] Memory Bus Width: 352 bits (ECC disabled)
[05/30/2023-10:20:41] [I] Memory Clock Rate: 7 GHz
[05/30/2023-10:20:41] [I]
[05/30/2023-10:20:41] [I] TensorRT version: 8.5.2
[05/30/2023-10:20:41] [I] Engine loaded in 0.0537632 sec.
[05/30/2023-10:20:41] [I] [TRT] Loaded engine size: 38 MiB
[05/30/2023-10:20:42] [W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[05/30/2023-10:20:42] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +41, now: CPU 0, GPU 41 (MiB)
[05/30/2023-10:20:42] [I] Engine deserialized in 0.504064 sec.
[05/30/2023-10:20:42] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +28, now: CPU 0, GPU 69 (MiB)
[05/30/2023-10:20:42] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[05/30/2023-10:20:42] [I] Setting persistentCacheLimit to 0 bytes.
[05/30/2023-10:20:42] [I] Using random values for input images
[05/30/2023-10:20:42] [I] Created input binding for images with dimensions 1x3x640x640
[05/30/2023-10:20:42] [I] Using random values for output output
[05/30/2023-10:20:42] [I] Created output binding for output with dimensions 1x25200x85
[05/30/2023-10:20:42] [I] [TRT] The profiling verbosity was set to ProfilingVerbosity::kLAYER_NAMES_ONLY when the engine was built, so only the layer names will be returned. Rebuild the engine with ProfilingVerbosity::kDETAILED to get more verbose layer information.
[05/30/2023-10:20:42] [I] Starting inference
[05/30/2023-10:20:45] [I] The e2e network timing is not reported since it is inaccurate due to the extra synchronizations when the profiler is enabled.
[05/30/2023-10:20:45] [I] To show e2e network timing report, add --separateProfileRun to profile layer timing in a separate run or remove --dumpProfile to disable the profiler.
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --loadEngine=yolov7_NOT_dynamic_batch_INT8.engine --exportProfile=profile.json --exportLayerInfo=graph.json
How can I get the profile.metadata.json file from the generated .engine model?
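In the meantime I considered writing a minimal metadata file by hand. This is only a guess at the schema, inferred from the device information trtexec prints; the field names and output path below are mine, and the real file may contain more:

import json

import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on device 0

dev = cuda.Device(0)
attrs = dev.get_attributes()
metadata = {
    # Field names are my own guess, not a confirmed schema.
    "device": dev.name(),
    "compute_capability": "%d.%d" % dev.compute_capability(),
    "sm_count": attrs[cuda.device_attribute.MULTIPROCESSOR_COUNT],
    "global_memory_mib": dev.total_memory() // (1024 * 1024),
}
with open("profile.metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)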
I saw that the TensorRT Engine Explorer tutorial (TensorRT/tutorial.ipynb at main · NVIDIA/TensorRT · GitHub) needs three JSON files. If we generate the .engine model without using trtexec, how can we get the full set of JSON files for TensorRT Engine Explorer?
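For reference, my understanding is that the tutorial loads the three files into an EnginePlan, roughly like this; the paths are illustrative and I am not certain of the exact constructor signature:

from trex import EnginePlan

plan = EnginePlan(
    "graph.json",             # from trtexec --exportLayerInfo
    "profile.json",           # from trtexec --exportProfile
    "profile.metadata.json",  # the file I do not know how to generate
)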
Thanks