TensorRT Inference Server system RAM usage climbs until container is closed by OS

I am running TensorRT Inference Server v19.05 locally, using the NVIDIA Docker container from NGC, with the following command:

nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p9000:8001 -p8002:8002 -v /path/to/model_repository:/models nvcr.io/nvidia/tensorrtserver:19.05-py3 trtserver --model-store=/models

I am using the example model repository provided by NVIDIA. When I make continuous inference requests via gRPC to the resnet50_netdef model, the system RAM usage of the trtserver process keeps rising until it reaches almost 100% and the OS closes it.

How do I stop the RAM usage from continuously climbing?

My system specs follow:
Operating system: Ubuntu 18.04
Docker version: Docker version 18.09.6, build 481bc77

There is a known issue with the gRPC front-end: its memory usage grows up to a certain point before stabilizing. Can you try setting the following flags and see whether that resolves the issue?
--grpc-infer-thread-count=16 --grpc-stream-infer-thread-count=16
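
For reference, here is a rough sketch of how the full launch command might look with both flags appended to the trtserver arguments. It assumes the same image, port mappings, and placeholder model path as in the original post, and it only prints the command (dry run) unless you uncomment the last line. The value 16 is simply the one suggested above, not a tuned recommendation.

```shell
#!/usr/bin/env bash
# Sketch: the original launch command with the two suggested gRPC flags appended.
# /path/to/model_repository is the placeholder from the question; use your own path.
cmd=(
  nvidia-docker run --rm --shm-size=1g
  --ulimit memlock=-1 --ulimit stack=67108864
  -p8000:8000 -p9000:8001 -p8002:8002
  -v /path/to/model_repository:/models
  nvcr.io/nvidia/tensorrtserver:19.05-py3
  trtserver --model-store=/models
  --grpc-infer-thread-count=16
  --grpc-stream-infer-thread-count=16
)

# Print the assembled command for review (dry run).
printf '%s ' "${cmd[@]}"; echo

# To actually start the server, uncomment the next line:
# "${cmd[@]}"
```

Both flags go after `trtserver`, so they are passed to the server itself rather than to Docker.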

Thank you David, that solves the issue.