Huge speed difference between engines built from scratch and engines built from ONNX

Description

I have a yolov5 model which I would like to deploy.
I found that if I convert my model from ONNX to TensorRT, trtexec indicates an inference speed of 25 fps.
But if I build the model layer by layer using INetworkDefinition, the inference speed triples.
Why is the TensorRT engine so much faster when the model is built explicitly instead of converted from ONNX?
Both cases use int8 quantization.

Thanks!
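For reference, a minimal sketch of the ONNX-to-TensorRT build being timed here, using trtexec's int8 path. The file name, workspace size, and output path are placeholders, not taken from the original post:

```shell
# Build and time an int8 engine from the ONNX model with trtexec.
# --workspace is in MB; adjust for the Xavier AGX's available memory.
trtexec --onnx=yolov5s6.onnx \
        --int8 \
        --workspace=2048 \
        --saveEngine=yolov5s6_int8.engine
```

trtexec then reports per-iteration latency and throughput, which is where the 25 fps figure would come from.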

Environment

TensorRT Version: 7.1.3
GPU Type: Jetson Xavier AGX
CUDA Version: 10.2.89
CUDNN Version: 8.0
Operating System + Version: Jetpack 4.5.1

Hi @frederikschoeller,

It depends; sometimes the ONNX parser can introduce additional ops, which may affect the inference speed.

Thank you.

Hi @frederikschoeller,

We are working on this issue. Could you please share a repro script showing how you manually define the network?

Thank you.

Hi @frederikschoeller,

When you get a chance, could you please share the repro requested above so we can work on this issue?

Thank you.

Hi!

Does this suffice?

Hi @frederikschoeller,

We have checked both logs for the first conv. Both used trt_volta_int8x4_icudnn_int8x4_128x32_relu_small_c32_nn_v1 according to the profiler, but their times differ: 729.47 us vs. 1.4192 ms. This suggests they are the same model run at different problem sizes. We then checked the attached yolov5s6.onnx: its input size is [1, 3, 1280, 1280], but the code built from scratch (tensorrtx/yololayer.h at master · wang-xinyu/tensorrtx · GitHub) uses input size [1, 3, 640, 640].

Could you please check if you are comparing using the same problem size?

Thank you.
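One way to rule out a problem-size mismatch is to pin the ONNX build to the same input shape with trtexec's --shapes option. The input tensor name "images" below is an assumption based on typical YOLOv5 exports; check the actual name in your model:

```shell
# Force the ONNX-built engine to the same input shape as the
# from-scratch build before comparing timings.
# "images" is assumed to be the network input tensor name.
trtexec --onnx=yolov5s6.onnx \
        --int8 \
        --shapes=images:1x3x1280x1280
```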

I checked, and the model built from scratch indeed uses input size [1,3,1280,1280].

Hi @frederikschoeller, when you say "building from scratch" using INetworkDefinition, do you mean that you build it using C++ code?

Hi @frederikschoeller,

Could you please provide the verbose log produced when building the engine, by setting Severity::kVERBOSE in the code?

Thank you.
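If capturing the log from code is inconvenient, a sketch of an alternative: trtexec can emit an equivalent verbose build log with its --verbose flag (file names are placeholders):

```shell
# Capture a verbose engine-build log from trtexec, as an alternative
# to setting Severity::kVERBOSE in a custom ILogger.
trtexec --onnx=yolov5s6.onnx --int8 --verbose > build_verbose.log 2>&1
```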

Hi @frederikschoeller,

Could you please share these details (issue repro)? They will help us fix this issue.

Thank you.