Limiting GPU memory consumption of a deployed TensorFlow SavedModel

I’ve deployed a TensorFlow model in the SavedModel format, and I see that it consumes all of the GPU memory.

I know there is an “allow growth” option when creating a session that should tell TensorFlow to consume only the memory it actually needs.

How should this be done when deploying a TensorFlow SavedModel?
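For reference, this is roughly what the “allow growth” option looks like when you control session creation yourself (a sketch assuming TensorFlow 1.x via `tf.compat.v1`; the SavedModel path and `serve` tag are placeholders for your own model):

```python
import tensorflow as tf

# Sketch: GPU memory options available when you create the session yourself.
config = tf.compat.v1.ConfigProto()
# Grow GPU memory usage on demand instead of grabbing it all at startup:
config.gpu_options.allow_growth = True
# Alternatively, cap TF at a fixed fraction of total GPU memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5

with tf.compat.v1.Session(config=config) as sess:
    # Hypothetical path and tag -- replace with your exported SavedModel.
    tf.compat.v1.saved_model.loader.load(
        sess, ["serve"], "/path/to/saved_model")
```

The question is how to get the equivalent effect when the session is created by the serving runtime rather than by your own code.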


By default TRTIS sets the allow-growth option for TensorFlow, so it should not automatically reserve all of GPU memory. You can use the --tf-gpu-memory-fraction option when starting trtserver to tell TF to reserve some fraction of the GPU memory and not use any more than that. For example, --tf-gpu-memory-fraction=0.5 indicates that TF should reserve 50% of the GPU memory at startup, but that it should not use any memory beyond that.
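A sketch of the invocation described above (the model-repository path is a placeholder; only the --tf-gpu-memory-fraction flag is from the answer):

```shell
# Start trtserver with TF capped at 50% of GPU memory at startup.
trtserver --model-repository=/path/to/model_repository \
          --tf-gpu-memory-fraction=0.5
```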

--tf-gpu-memory-fraction works, thanks!