How to solve "buildCudaEngine" taking a long time

Hi, I use the TensorRT API to create my network, and when the program runs to the line “engine_ = builder_->buildCudaEngine(*network_);”, it takes a very long time. I want to know why, and how to shorten the time?



buildCudaEngine takes time because TensorRT optimizes the kernels using information about both the model and the GPU architecture.

A typical workflow is to serialize the compiled engine (called a PLAN) once, and then relaunch TensorRT from the PLAN on subsequent runs to save the building time.

Here is some sample code for PLAN serialization and deserialization for your reference:
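A minimal sketch of that workflow, assuming the TensorRT ≤5 C++ API (buildCudaEngine / deserializeCudaEngine) and a hypothetical file name "model.plan"; the builder, network, and runtime objects are assumed to be created elsewhere:

```cpp
#include <fstream>
#include <vector>
#include "NvInfer.h"

// Build once, then write the optimized engine (PLAN) to disk.
void savePlan(nvinfer1::IBuilder* builder, nvinfer1::INetworkDefinition* network)
{
    nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network); // slow step
    nvinfer1::IHostMemory* plan = engine->serialize();
    std::ofstream out("model.plan", std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());
    plan->destroy();
    engine->destroy();
}

// On later runs, skip the expensive build by deserializing the PLAN.
nvinfer1::ICudaEngine* loadPlan(nvinfer1::IRuntime* runtime)
{
    std::ifstream in("model.plan", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    // Third argument is the plugin factory; nullptr if no custom layers.
    return runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
}
```

Note that a PLAN is specific to the GPU and TensorRT version it was built with, so rebuild it when either changes.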


Thanks for your reply.

I saved my PLAN to a file and then deserialized it from the PLAN file in my program, but the next line, “nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(modelMen, modelsize, &pluginFactory);”, failed!

I guess the custom layer’s parameters are not saved in the PLAN file. Can you give me some advice?

Is there any demo of a PLAN file that contains a custom layer?


You can find the plugin sample here:
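For reference, deserializing a PLAN that contains a custom layer requires a plugin factory so TensorRT can recreate the plugin from the bytes your plugin wrote via its serialize() method. A sketch, assuming the TensorRT 4/5-era IPluginFactory interface; MyPlugin and the layer name "my_custom_layer" are hypothetical placeholders for your own IPlugin subclass and layer name:

```cpp
#include <cstring>
#include "NvInfer.h"

// MyPlugin is your own nvinfer1::IPlugin implementation (not shown). Its
// parameters are only stored in the PLAN if it implements
// getSerializationSize() and serialize() correctly at build time.
class MyPlugin; // hypothetical custom-layer plugin

class PluginFactory : public nvinfer1::IPluginFactory
{
public:
    // Called once per custom layer during deserialization; serialData holds
    // exactly the bytes the plugin serialized when the PLAN was built.
    nvinfer1::IPlugin* createPlugin(const char* layerName,
                                    const void* serialData,
                                    size_t serialLength) override
    {
        if (std::strcmp(layerName, "my_custom_layer") == 0)
        {
            // Restore the plugin's parameters from the serialized bytes
            // via its deserialization constructor.
            plugin_ = new MyPlugin(serialData, serialLength);
            return reinterpret_cast<nvinfer1::IPlugin*>(plugin_);
        }
        return nullptr; // unknown layer: deserialization will fail
    }

private:
    MyPlugin* plugin_{nullptr};
};

// Usage (the factory must outlive deserialization):
//   PluginFactory pluginFactory;
//   nvinfer1::ICudaEngine* engine =
//       runtime->deserializeCudaEngine(modelData, modelSize, &pluginFactory);
```

If deserializeCudaEngine fails with a custom layer, the usual causes are a plugin serialize() that does not write all of its parameters, or a factory whose layer-name check does not match the name used when the network was built.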