Model run using nvinferserver occupying high GPU memory-usage

Hi @ersheng,
I tried the suggested TF-TRT guide, but the conversion reported trt_engine_opts as 0. So I tried the config.pbtxt optimization changes suggested above, which converted a portion of the graph to trt_engines.
I then had trouble running the on-the-fly converted model with the changed config, and I have created a separate topic for that. Here is the reference:

Basically, either the parameters given for the TF-TRT conversion or the model itself seems to be the issue.
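
In case it helps anyone reproduce the check, here is a minimal sketch of how the engine count can be verified on a converted graph. TF-TRT replaces converted segments with nodes of op type `TRTEngineOp`, so a count of 0 means nothing was converted. The helper name `count_trt_engine_ops` is my own (hypothetical), and it only assumes a GraphDef-like object exposing `.node` and, optionally, `.library.function[*].node_def`. The toy graph below is made up for illustration and is not my actual model.

```python
from collections import Counter
from types import SimpleNamespace


def count_trt_engine_ops(graph_def):
    """Return (number of TRTEngineOp nodes, total nodes) for a GraphDef-like object."""
    nodes = list(graph_def.node)
    # Converted segments may also live inside library functions,
    # so include those nodes when the object has a function library.
    library = getattr(graph_def, "library", None)
    for fn in getattr(library, "function", []):
        nodes.extend(fn.node_def)
    op_counts = Counter(n.op for n in nodes)
    return op_counts["TRTEngineOp"], len(nodes)


# Toy stand-in for a real GraphDef (illustrative only).
toy_graph = SimpleNamespace(
    node=[
        SimpleNamespace(op="Placeholder"),
        SimpleNamespace(op="TRTEngineOp"),
        SimpleNamespace(op="Relu"),
    ]
)

engines, total = count_trt_engine_ops(toy_graph)
print(f"TRTEngineOp nodes: {engines} of {total}")
```

With a real model, the same function should work on the GraphDef loaded from the converted SavedModel or frozen graph. If it still returns 0 engines, the conversion parameters (for example the minimum segment size) or unsupported ops in the model are the first things I would look at.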