TensorRT memory management

user130951 · February 21, 2022, 3:57am

Description

Hi! I have been using TensorRT for a couple of months, and I wonder whether there is a way to manage its memory use myself. My program needs to deploy more than one TensorRT engine while it is running, and the problem is this:
Every time a new engine is loaded, it locks a specific part of GPU memory. However, the image processing functions also need GPU memory. As a result, the memory can become fragmented, and in some cases engine loading fails.
So is there a function or API that lets me reserve a specific part of GPU memory before any engine is deployed? Also, even after an engine is unloaded, its part of the memory is still not accessible until the program exits or the process is killed.
This is similar to the TensorFlow memory pool, but I am not sure whether TensorRT has a similar feature. Could anyone give me some help or sample code that I can refer to?

Environment

TensorRT Version: 7.2.2.3
GPU Type: Tesla V100
Nvidia Driver Version: 450.51.05
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):


spolisetty · February 24, 2022, 10:10am

Hi,

You can set up the allocator yourself with setGpuAllocator.
This requires you to implement your own allocator.
I don’t think we have sample code that implements a memory pool for now.
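Since no official pool sample is mentioned here, the following is only a minimal sketch of the bookkeeping such an allocator would need: a first-fit sub-allocator over one region reserved up front. PoolAllocator, base, and capacity are hypothetical names, not part of TensorRT, and base stands in for the device pointer of a single large allocation made before any engine is loaded. Nothing here calls CUDA or TensorRT.

```python
class PoolAllocator:
    """First-fit sub-allocator over one pre-reserved region (hypothetical helper).

    `base` stands in for the device pointer of one large up-front allocation;
    this class only does the bookkeeping and never touches the GPU.
    """

    def __init__(self, base, capacity):
        self.base = base
        self.capacity = capacity
        self.free_blocks = [(0, capacity)]  # sorted list of (offset, size)
        self.used = {}                      # address -> (offset, size)

    def allocate(self, size, alignment=256, flags=0):
        """Return an address inside the pool, or None if no free block fits."""
        size = max(size, 1)
        alignment = max(alignment, 1)
        for i, (off, blk) in enumerate(self.free_blocks):
            # Round the absolute address up to the requested alignment.
            addr = -(-(self.base + off) // alignment) * alignment
            start = addr - self.base
            end = start + size
            if end <= off + blk:
                pieces = []
                if start > off:              # alignment padding stays free
                    pieces.append((off, start - off))
                if end < off + blk:          # tail of the block stays free
                    pieces.append((end, off + blk - end))
                self.free_blocks[i:i + 1] = pieces
                self.used[addr] = (start, size)
                return addr
        return None

    def free(self, addr):
        """Return a block to the pool and merge adjacent free blocks."""
        off, size = self.used.pop(addr)
        self.free_blocks.append((off, size))
        self.free_blocks.sort()
        merged = []
        for o, s in self.free_blocks:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + s)
            else:
                merged.append((o, s))
        self.free_blocks = merged
```

To use this with TensorRT, an IGpuAllocator implementation would forward its allocate and free calls to a pool like this and be registered through setGpuAllocator. The exact set of methods to override differs between TensorRT versions, so check the API reference for the version you are on.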

Besides, regarding the memory that stays inaccessible after an engine is unloaded: this should not happen. As long as you correctly destroy all execution contexts and engines, all memory will be freed.
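One way to check this in your own program is to count what a custom allocator has handed out and what has come back. The sketch below is only an illustration under assumptions: TrackingAllocator and the fake backend functions are hypothetical, and in a real setup the two callables would be whatever actually allocates and frees device memory.

```python
import itertools


class TrackingAllocator:
    """Wraps allocate/free callables and records what is still outstanding."""

    def __init__(self, alloc_fn, free_fn):
        self._alloc = alloc_fn
        self._free = free_fn
        self.live = {}  # address -> size of allocations not yet freed
        self.peak = 0   # highest number of bytes outstanding at once

    def allocate(self, size, alignment=256, flags=0):
        addr = self._alloc(size)
        if addr:  # only record successful allocations
            self.live[addr] = size
            self.peak = max(self.peak, self.outstanding())
        return addr

    def free(self, addr):
        self.live.pop(addr, None)
        self._free(addr)

    def outstanding(self):
        """Bytes handed out and not yet returned."""
        return sum(self.live.values())


# Fake backend for demonstration only: hands out made-up addresses.
_fake_addresses = itertools.count(0x1000, 0x1000)
tracker = TrackingAllocator(lambda size: next(_fake_addresses), lambda addr: None)

p = tracker.allocate(512)
q = tracker.allocate(1024)
tracker.free(p)
tracker.free(q)
```

If outstanding() is still above zero after every execution context and engine has been destroyed, something is still holding memory obtained through the allocator.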

Please refer to the TensorRT samples in the Sample Support Guide (NVIDIA Deep Learning TensorRT Documentation) and on GitHub:
https://github.com/NVIDIA/TensorRT/tree/master/samples

Thank you.

NVES · February 24, 2022, 10:37am

Hi,
Please check the link below, as it might answer your concerns.

Thanks!

user130951 · February 24, 2022, 10:50am

Hi,
Thank you for your reply. That is what I am looking for. I really appreciate it!

user130951 · February 24, 2022, 10:59am

Hi,
Thank you for your reply. I have experimented with creating IGpuAllocator instances to control memory use when deserializing engines. This seems to work as long as the IRuntime instance has not been destroyed. I am not sure whether I am doing this correctly.

system · March 10, 2022, 10:59am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
