How to make TensorRT engine initialization faster?

I am working on an NVIDIA NX with C++.
I found that the functions "createInferRuntime" and "deserializeCudaEngine" take too long the first time they run.
Is there any way to make this faster?

Hi @fanyj233,

Please refer to the following doc: Best Practices For TensorRT Performance.
We recommend that you provide more details about the issue, along with a reproducible model and scripts.
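
One way to gather those details is to time each initialization step separately, so it is clear whether the delay comes from "createInferRuntime", "deserializeCudaEngine", or context creation. The following is only a minimal sketch: `timeStepMs` and `profileSteps` are hypothetical helper names (not part of TensorRT), and the TensorRT calls in the trailing comment assume a TensorRT 8-style API plus variables (`logger`, `blob`, `size`) that you would supply yourself.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a TensorRT API): runs one step and returns
// its wall-clock duration in milliseconds.
double timeStepMs(const std::function<void()>& step) {
    const auto start = std::chrono::steady_clock::now();
    step();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

using NamedStep = std::pair<std::string, std::function<void()>>;

// Hypothetical helper: runs the steps in order, prints one line per step,
// and returns the (name, milliseconds) pairs for further reporting.
std::vector<std::pair<std::string, double>> profileSteps(
    const std::vector<NamedStep>& steps) {
    std::vector<std::pair<std::string, double>> results;
    results.reserve(steps.size());
    for (const auto& step : steps) {
        const double ms = timeStepMs(step.second);
        std::cout << step.first << ": " << ms << " ms\n";
        results.emplace_back(step.first, ms);
    }
    return results;
}

// Possible usage with TensorRT (assumed, not compiled here):
//
//   nvinfer1::IRuntime* runtime = nullptr;
//   nvinfer1::ICudaEngine* engine = nullptr;
//   nvinfer1::IExecutionContext* context = nullptr;
//   profileSteps({
//       {"createInferRuntime",
//        [&] { runtime = nvinfer1::createInferRuntime(logger); }},
//       {"deserializeCudaEngine",
//        [&] { engine = runtime->deserializeCudaEngine(blob, size); }},
//       {"createExecutionContext",
//        [&] { context = engine->createExecutionContext(); }},
//   });
```

Posting the per-step timings together with the TensorRT version and the engine file size makes the report much easier to reproduce. Running the same steps a second time in the same process also shows how much of the delay is a one-time cost.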

Thank you.

Hi @spolisetty @fanyj233
I have been struggling with this problem for a long time. TensorRT still takes 8~10 s to deserialize the engine and create the context. I found some previous discussions, but the replies there do not exactly solve my problem. I think this is clearly a performance issue when using TensorRT in real-time cases.

ref: nvinfer1::ICudaEngine deserializeCudaEngine takes 40-60 sec
ref: TensorRT Caching mechanism not very fast. deserializeCudaEngine takes some time