Hi! I have been using TensorRT for a couple of months, and I wonder whether there is a way to manage GPU memory usage myself. My program needs to deploy more than one TensorRT engine at runtime, and the problem is:
Every time a new engine is loaded, it locks a specific part of GPU memory. However, the image-processing functions also require GPU memory, so in some cases memory fragmentation can cause the engine load to fail.
So is there a function or API that lets me reserve a specific part of GPU memory before any engine is deployed? Ideally, even after an engine is unloaded, this reserved region would remain held (not returned to the system) until the program exits or the process is killed.
This is similar to TensorFlow's memory pool, but I am not sure whether TensorRT has a comparable feature. Could anyone point me to some help or sample code that I can refer to?
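To make the desired behavior concrete, here is a minimal, device-agnostic sketch of the reserve-then-suballocate pattern I am asking about. A `bytearray`-sized counter stands in for one large device allocation made at startup (e.g. a single `cudaMalloc`); the class name `FixedPool` and all details are hypothetical illustration, not an actual TensorRT API. (I am aware TensorRT exposes an `IGpuAllocator` interface that can be attached to the builder/runtime, but I am unsure whether it can enforce this kind of persistent reservation.)

```python
class FixedPool:
    """Hypothetical sketch: reserve one block up front, then hand out
    sub-allocations with a simple bump pointer. In a real program the
    block would be a single large device allocation made before any
    engine is deployed."""

    def __init__(self, size):
        self.size = size
        self.offset = 0  # next free byte within the reserved block

    def allocate(self, nbytes, alignment=256):
        # Round the current offset up to the requested alignment.
        start = (self.offset + alignment - 1) // alignment * alignment
        if start + nbytes > self.size:
            raise MemoryError("pool exhausted")
        self.offset = start + nbytes
        return start  # offset into the reserved block

    def reset(self):
        # "Unloading an engine" only rewinds the pointer; the reserved
        # block itself is never returned to the system, which is the
        # behavior described above.
        self.offset = 0


pool = FixedPool(1 << 20)   # pretend this is one big device allocation
a = pool.allocate(1000)     # first engine's workspace -> offset 0
b = pool.allocate(1000)     # second allocation, aligned to 256 bytes
```

The idea is that all engines and image-processing buffers would draw from this one pre-reserved block, so later loads can never fail due to fragmentation of the global allocator.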
TensorRT Version: 184.108.40.206
GPU Type: Tesla V100
Nvidia Driver Version: 450.51.05
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered