Create engine file from ONNX (YOLO)

Is there a script somewhere that converts ONNX files to engine files? Also, is it universal? That is, can I use the same script that converts SSD ONNX files to convert YOLO ONNX files?

I tried building onnx-tensorrt with CMake, but that gives:

The source directory

  does not contain a CMakeLists.txt file.

If I try
sudo python setup.py install
I get:

building 'onnx_tensorrt.parser._nv_onnx_parser_bindings' extension
swigging nv_onnx_parser_bindings.i to nv_onnx_parser_bindings_wrap.cpp
swig -python -c++ -modern -builtin -o nv_onnx_parser_bindings_wrap.cpp nv_onnx_parser_bindings.i
NvOnnxParser.h:213: Error: Syntax error in input(1).
error: command 'swig' failed with exit status 1


Please refer to the sample below for YOLO-to-ONNX conversion:
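As a rough sketch: if your YOLO model is a frozen TensorFlow graph (.pb), the tf2onnx command-line converter can produce the ONNX file. The input:0 and output:0 tensor names below are placeholders, not taken from your model; substitute the actual tensor names from your graph (a viewer such as Netron can show them):

python -m tf2onnx.convert --graphdef yolov2-tiny-voc.pb --inputs input:0 --outputs output:0 --opset 11 --output yolov2-tiny-voc.onnx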

You can use the “trtexec” command-line tool to optimize the model, understand its performance, and possibly locate bottlenecks.
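A typical invocation might look like the following (model.onnx and model.engine are placeholders; --fp16 builds with mixed precision and --dumpProfile prints per-layer timings, both optional):

trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 --dumpProfile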

Please find the links below for your reference:


I am using YOLO, so as far as I know I do not have a prototxt file (only a .pb). I tried converting my ONNX file via:

trtexec --onnx=yolov2-tiny-voc.onnx  --saveEngine=yolov2-tiny-voc.engine

but it gives me

&&&& RUNNING TensorRT.trtexec # trtexec --onnx=yolov2-tiny-voc.onnx --saveEngine=yolov2-tiny-voc.engine
[01/21/2020-10:22:19] [I] === Model Options ===
[01/21/2020-10:22:19] [I] Format: ONNX
[01/21/2020-10:22:19] [I] Model: yolov2-tiny-voc.onnx
[01/21/2020-10:22:19] [I] Output:
[01/21/2020-10:22:19] [I] === Build Options ===
[01/21/2020-10:22:19] [I] Max batch: 1
[01/21/2020-10:22:19] [I] Workspace: 16 MB
[01/21/2020-10:22:19] [I] minTiming: 1
[01/21/2020-10:22:19] [I] avgTiming: 8
[01/21/2020-10:22:19] [I] Precision: FP32
[01/21/2020-10:22:19] [I] Calibration: 
[01/21/2020-10:22:19] [I] Safe mode: Disabled
[01/21/2020-10:22:19] [I] Save engine: yolov2-tiny-voc.engine
[01/21/2020-10:22:19] [I] Load engine: 
[01/21/2020-10:22:19] [I] Inputs format: fp32:CHW
[01/21/2020-10:22:19] [I] Outputs format: fp32:CHW
[01/21/2020-10:22:19] [I] Input build shapes: model
[01/21/2020-10:22:19] [I] === System Options ===
[01/21/2020-10:22:19] [I] Device: 0
[01/21/2020-10:22:19] [I] DLACore: 
[01/21/2020-10:22:19] [I] Plugins:
[01/21/2020-10:22:19] [I] === Inference Options ===
[01/21/2020-10:22:19] [I] Batch: 1
[01/21/2020-10:22:19] [I] Iterations: 10
[01/21/2020-10:22:19] [I] Duration: 3s (+ 200ms warm up)
[01/21/2020-10:22:19] [I] Sleep time: 0ms
[01/21/2020-10:22:19] [I] Streams: 1
[01/21/2020-10:22:19] [I] ExposeDMA: Disabled
[01/21/2020-10:22:19] [I] Spin-wait: Disabled
[01/21/2020-10:22:19] [I] Multithreading: Disabled
[01/21/2020-10:22:19] [I] CUDA Graph: Disabled
[01/21/2020-10:22:19] [I] Skip inference: Disabled
[01/21/2020-10:22:19] [I] Input inference shapes: model
[01/21/2020-10:22:19] [I] Inputs:
[01/21/2020-10:22:19] [I] === Reporting Options ===
[01/21/2020-10:22:19] [I] Verbose: Disabled
[01/21/2020-10:22:19] [I] Averages: 10 inferences
[01/21/2020-10:22:19] [I] Percentile: 99
[01/21/2020-10:22:19] [I] Dump output: Disabled
[01/21/2020-10:22:19] [I] Profile: Disabled
[01/21/2020-10:22:19] [I] Export timing to JSON file: 
[01/21/2020-10:22:19] [I] Export output to JSON file: 
[01/21/2020-10:22:19] [I] Export profile to JSON file: 
[01/21/2020-10:22:19] [I] 
Input filename:   yolov2-tiny-voc.onnx
ONNX IR version:  0.0.4
Opset version:    11
Producer name:    tf2onnx
Producer version: 1.6.0
Model version:    0
Doc string:       
[01/21/2020-10:22:19] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[01/21/2020-10:22:19] [E] [TRT] Network validation failed.
[01/21/2020-10:22:19] [E] Engine creation failed
[01/21/2020-10:22:19] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=yolov2-tiny-voc.onnx --saveEngine=yolov2-tiny-voc.engine

I also tried:

trtexec --onnx=yolov2-tiny-voc.onnx --deploy=yolov2-tiny-voc.pb --saveEngine=yolov2-tiny-voc.engine --useCudaGraph

but that gives me:

Note: CUDA graphs is not supported in this version.


In this case your model does not have static dimensions.
For a model with dynamic inputs you have to provide an optimization profile. Also, in TRT 7 you have to specify “--explicitBatch” when optimizing an ONNX model.
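A sketch of such a build command follows; the input tensor name (input) and the 1x416x416x3 shape are assumptions here, so substitute the actual input name and dimensions of your model (again, Netron can show them):

trtexec --onnx=yolov2-tiny-voc.onnx --explicitBatch --minShapes=input:1x416x416x3 --optShapes=input:1x416x416x3 --maxShapes=input:1x416x416x3 --saveEngine=yolov2-tiny-voc.engine

If only the batch dimension is dynamic, the three shape options can differ only in that dimension; setting min, opt, and max to the same value, as above, pins the profile to a single static shape.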

Please refer to the link below for more details: