TensorRT uses too much device memory. Is there a config setting to reduce it?

I am running the Tacotron2 TensorRT model on a V100 16 GB card. Even after converting the model to FP16, it still uses about 12 GB of device memory. Does anyone know of a config setting that reduces the device memory the TensorRT engine uses? PS: when I run the same model in PyTorch, it uses only 2.2 GB.


A network with multiple dynamic shapes incurs extra memory usage.
Since several layers in the encoder and decoder have dynamic shapes, that may be what is causing the extra memory usage when running under TRT.
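
If dynamic shapes are the cause, two build-time settings may be worth trying: capping the builder workspace, and narrowing the min/opt/max range of the optimization profile so that buffers are not sized for sequence lengths you never run. The sketch below is only an illustration under assumptions, not a confirmed fix for this model. The helper names `tensor_bytes` and `limit_engine_memory`, the input name, and all shapes are hypothetical; the TensorRT calls used (`set_memory_pool_limit`, `max_workspace_size`, `create_optimization_profile`, `set_shape`, `add_optimization_profile`) are from the TensorRT Python API, and which workspace setting exists depends on your TensorRT version.

```python
def tensor_bytes(shape, bytes_per_element=2):
    """Size in bytes of one dense tensor; 2 bytes per element corresponds to FP16."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def limit_engine_memory(builder, config, input_name,
                        min_shape, opt_shape, max_shape, workspace_bytes):
    """Cap the workspace and register one narrow optimization profile.

    `builder` is a trt.Builder and `config` comes from
    builder.create_builder_config(). `input_name` and the shapes are
    placeholders and must match the inputs of your own network.
    """
    import tensorrt as trt  # imported here so tensor_bytes works without TensorRT

    if hasattr(config, "set_memory_pool_limit"):
        # Newer TensorRT releases expose the workspace as a memory pool.
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)
    else:
        # Older releases use the max_workspace_size attribute instead.
        config.max_workspace_size = workspace_bytes

    # A tighter max shape gives TensorRT a smaller worst case to plan for.
    profile = builder.create_optimization_profile()
    profile.set_shape(input_name, min_shape, opt_shape, max_shape)
    config.add_optimization_profile(profile)
    return config


# Rough comparison for one hypothetical FP16 activation of shape
# (batch, channels, frames) at two different profile maxima.
wide = tensor_bytes((1, 512, 2000))
narrow = tensor_bytes((1, 512, 500))
```

The `tensor_bytes` comparison only shows how a single tensor scales with the profile maximum; the real engine footprint also includes weights, workspace, and per-context state, so measure the actual device memory after rebuilding the engine.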