How to use TensorRT for continuous inference without a delay between frames?

I ran into a problem when running TensorRT on successive video frames without a “sleep” between them. The first few executions run at normal speed, but later ones are much slower. When I add an interval of several milliseconds between two inputs, every execution takes the “normal” time.
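
To make the comparison concrete, here is a minimal sketch of how the two cases could be timed side by side. It assumes a hypothetical `infer(frame)` callable that wraps the TensorRT execution and blocks until the output is ready; `fake_infer`, `time_inference`, and `pause_s` are placeholder names for illustration, not part of TensorRT.

```python
import time


def time_inference(infer, frames, pause_s=0.0):
    """Return the latency in milliseconds of each infer(frame) call.

    `infer` is a hypothetical callable; it is assumed to block until the
    result is available, otherwise the timings only cover the launch.
    """
    latencies_ms = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        if pause_s > 0:
            # Optional gap between inputs; it is not included in the latency.
            time.sleep(pause_s)
    return latencies_ms


def fake_infer(frame):
    # Stand-in for the real TensorRT call so the sketch runs on its own.
    return sum(frame)


if __name__ == "__main__":
    frames = [[0.0] * 1000 for _ in range(50)]
    back_to_back = time_inference(fake_infer, frames)
    with_pause = time_inference(fake_infer, frames, pause_s=0.005)
    print("no pause   max ms:", max(back_to_back))
    print("5 ms pause max ms:", max(with_pause))
```

Printing the whole per-frame list instead of only the maximum would show at which frame the slowdown starts.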