TensorRT Inference Server system RAM usage climbs until container is closed by OS

sidney.rubidge · June 20, 2019, 3:06pm

I am running TensorRT Inference Server v19.05 locally using the nvidia docker container from NGC with the following command.

nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p9000:8001 -p8002:8002 -v /path/to/model_repository:/models nvcr.io/nvidia/tensorrtserver:19.05-py3 trtserver --model-store=/models

I am using the example model repository provided by nvidia. When I make continuous inference requests via grpc to the resnet50_netdef model the system ram usage of the trtserver process continues to rise until it reaches almost 100% and the OS closes it.

How do I stop the RAM usage from continuously climbing?

My system specs follow:
Operating system: Ubuntu 18.04
RAM: 32GB
Docker version: Docker version 18.09.6, build 481bc77

David_Goodwin · June 21, 2019, 8:48pm

There is a known issue with the grpc front-end that its memory usage will grow up to a certain point before stabilizing. Can you try setting the following flags and see if that resolves the issue:
–grpc-infer-thread-count=16 --grpc-stream-infer-thread-count=16

sidney.rubidge · June 23, 2019, 7:59pm

Thank you David, that solves the issue.

Topic		Replies	Views
how to release the TRTIS memory Triton Inference Server (archived)	2	1050	October 23, 2019
Problem with accumulating gpu memory usage in tritonserver TensorRT cudnn , inference-server-triton , deepstream	0	282	September 3, 2024
Triton server memory accumulation problem TensorRT cudnn	1	445	March 14, 2024
Random spikes in RAM while using Triton Inference TensorRT tensorrt , cuda , ubuntu , inference-server-triton	1	518	August 3, 2023
Tf consuming too much RAM memory Jetson AGX Orin tensorrt , tensorflow , python	3	652	March 7, 2023
Official Tensorflow uses all RAM when running on GPU Jetson Nano	6	2393	May 21, 2019
TRTIS randomly quit without any error message Triton Inference Server (archived)	0	604	May 12, 2020
TensorRT model consuming more amount of RAM Jetson TX2 tensorrt	2	983	August 20, 2020
Memory exceeded error when running triton-inference-server General Topics & Other SDKs cuda , inference-server-triton , gpu	0	1365	February 2, 2023
DeepStream 6.0.1 Triton GRPC memory leak DeepStream SDK nvbugs	23	3124	September 2, 2022

TensorRT Inference Server system RAM usage climbs until container is closed by OS

Related topics