After calling DeepStream from FastAPI, a large amount of memory is not released

• Hardware Platform (Jetson / GPU): Jetson
• DeepStream Version: 6.2
• JetPack Version (valid for Jetson only): 5.1.4

I am deploying DeepStream 6.2 in Docker. I have compiled my DeepStream application into a dynamic link library that exposes two function interfaces (start task and stop task), and I call them from FastAPI. When I call the stop task interface, the task does stop, but a large amount of memory used by DeepStream is still not released. Because FastAPI keeps running, the process as a whole never exits (unlike the DeepStream test example programs, which terminate the entire process once recognition finishes), and this is not memory I allocated manually. I suspect it is the cache allocated by the TensorRT model at runtime. How can I clear these caches?
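For reference, here is a minimal sketch of the kind of wrapper described above. The library name `libds_task.so` and the signatures `start_task(const char *uri)` / `stop_task(void)` are hypothetical placeholders, not the poster's actual code:

```python
# Minimal FastAPI wrapper around a DeepStream shared library (sketch).
# Assumptions (hypothetical): the library is named libds_task.so and
# exposes start_task(const char *uri) and stop_task(void).
import ctypes

from fastapi import FastAPI

# Load the DeepStream dynamic link library once at startup.
ds = ctypes.CDLL("./libds_task.so")
ds.start_task.argtypes = [ctypes.c_char_p]
ds.start_task.restype = ctypes.c_int
ds.stop_task.argtypes = []
ds.stop_task.restype = ctypes.c_int

app = FastAPI()

@app.post("/start")
def start(uri: str):
    # start_task is expected to build the pipeline and run it in a worker thread.
    rc = ds.start_task(uri.encode("utf-8"))
    return {"status": rc}

@app.post("/stop")
def stop():
    # For the memory to come back while the process keeps running, stop_task on
    # the native side typically needs to set the pipeline to GST_STATE_NULL and
    # gst_object_unref() it; the TensorRT engine held by nvinfer is generally
    # only freed when its element is destroyed.
    rc = ds.stop_task()
    return {"status": rc}
```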

There is not enough information in your description to diagnose this.

How did you implement the two APIs? What happens inside them when a task is stopped?