Does TensorRT optimize memory consumption?

Hello, does TensorRT optimize memory consumption?
Would memory consumption decrease when running inference with TensorRT?

TensorRT optimizes the network by combining layers and optimizing kernel selection for improved latency, throughput, power efficiency, and memory consumption.
The serialized TensorRT engine file can be larger than the original model. For memory usage at runtime, however, please refer to the “How do I determine how much device memory will be required by my network?” section in the link below:
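
For a rough feel of how a runtime estimate might be put together, here is a minimal Python sketch. It rests on assumptions not stated in this thread: that the serialized engine size is a reasonable proxy for weight memory, that each execution context needs its own activation memory, and that the TensorRT Python attribute `ICudaEngine.device_memory_size` reports that activation size (newer releases may prefer a different attribute). The helper names `estimate_device_memory_bytes` and `read_engine_sizes` are hypothetical, and CUDA context overhead is not measured here, only passed in.

```python
def estimate_device_memory_bytes(weights_bytes, activation_bytes,
                                 num_contexts=1, overhead_bytes=0):
    """Rough estimate: weights are counted once, activations once per context.

    Assumption: weights are shared by all execution contexts of an engine,
    while each context holds its own activation (scratch) memory.
    """
    if num_contexts < 1:
        raise ValueError("num_contexts must be at least 1")
    return weights_bytes + num_contexts * activation_bytes + overhead_bytes


def read_engine_sizes(engine_path):
    """Return (serialized_size, activation_size) for an engine file.

    Requires TensorRT and a GPU; imported lazily so the estimator above
    can be used without them.
    """
    import tensorrt as trt

    with open(engine_path, "rb") as f:
        blob = f.read()
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(blob)
    # Serialized size is used here as an approximation of weight memory.
    return len(blob), engine.device_memory_size


if __name__ == "__main__":
    # Made-up illustrative numbers, not measurements of any real model.
    total = estimate_device_memory_bytes(
        weights_bytes=100 * 2**20,    # 100 MiB engine file
        activation_bytes=40 * 2**20,  # 40 MiB per context
        num_contexts=2,
    )
    print(f"estimated device memory: {total / 2**20:.0f} MiB")
```

To check such an estimate against reality, one option is to compare free device memory (for example with `nvidia-smi`) before and after the engine and its contexts are created.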


Thanks for your reply.
I have read the link.
Are there any official benchmarks of runtime memory usage for typical models during TensorRT inference?