Why does TensorRT use so much memory?

ClancyLian · January 4, 2018, 6:58am

Hi,
I am using TensorRT instead of Caffe to accelerate my application, and I found that memory usage increases by about 400 MB when I parse two models. I want to run 4 processes on a TX2, but I can only run 2 because TensorRT occupies too much memory. I have not created many buffers, yet memory usage rises quickly as soon as I construct a TensorRT object.

AastaLLL · January 5, 2018, 6:03am

Hi,

There are two sources of memory consumption:
1. Loading libraries (TensorRT, cuDNN, cuBLAS…)

Amount: around 600 MiB (TensorRT 3)
Required, but shared among all processes.

2. Building the inference engine

Amount: depends on the network size
Can be limited with setMaxWorkspaceSize() and setMaxBatchSize(). Each process has its own consumption.

Thanks.
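As a rough illustration of the two limits mentioned above, the sketch below applies them through a small helper. FakeBuilder and applyMemoryLimits are hypothetical names introduced here so the example compiles without TensorRT installed; only the two setter names come from the reply, and the 16 MiB workspace value is an assumption, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in exposing the two setters named in the reply.
// With TensorRT available, the real builder object would be passed instead.
struct FakeBuilder {
    std::size_t workspace = 0;
    int maxBatch = 0;
    void setMaxWorkspaceSize(std::size_t bytes) { workspace = bytes; }
    void setMaxBatchSize(int n) { maxBatch = n; }
};

// Apply per-process limits before the engine is built.
template <typename Builder>
void applyMemoryLimits(Builder& builder, std::size_t workspaceBytes, int maxBatch) {
    assert(maxBatch >= 1);  // a batch size below 1 is not meaningful
    builder.setMaxBatchSize(maxBatch);
    builder.setMaxWorkspaceSize(workspaceBytes);
}
```

Because each process builds its own engine, these limits would need to be set in every process; they do not affect the memory taken by the loaded libraries.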

ClancyLian · January 10, 2018, 11:23am

Hi, AastaLLL,

Before I start any process, memory usage is about 2388 / 7851 MB.
When I start one process, memory usage is about 3592 / 7851 MB.
When I start a second process, memory usage rises to 6000 / 7851 MB. I don't know why the second process uses twice as much memory as the first when using TensorRT. When I use only Caffe, the second process uses the same amount of memory as the first.
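To make the reported growth concrete, the following sketch computes the increase attributed to each newly started process from the three readings above. The function name perProcessDeltaMB is hypothetical, and the readings are simply the numbers quoted in the post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Takes successive "memory in use" readings (MB): idle, after process 1,
// after process 2, and so on. Returns the increase between consecutive readings.
std::vector<std::int64_t> perProcessDeltaMB(const std::vector<std::int64_t>& usedMB) {
    std::vector<std::int64_t> deltas;
    for (std::size_t i = 1; i < usedMB.size(); ++i) {
        assert(usedMB[i] >= usedMB[i - 1]);  // readings are expected to grow
        deltas.push_back(usedMB[i] - usedMB[i - 1]);
    }
    return deltas;
}
```

Calling perProcessDeltaMB({2388, 3592, 6000}) gives 1204 MB for the first process and 2408 MB for the second, which is the doubling described in the post.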

AastaLLL · January 12, 2018, 10:38am

Hi,

Were the figures (3592 / 7851 MB) → (6000 / 7851 MB) measured with TensorRT or with Caffe?

Could you share the detailed memory usage for both the TensorRT and the Caffe case?
Also, please share some information about your model.

Thanks.

ClancyLian · January 15, 2018, 12:28am

Hi,

I use the three MTCNN models: P-Net, R-Net and O-Net. The models are in the attachment.

I use TensorRT for R-Net and O-Net. For these two nets, I wrote a PReLU plugin layer.

I use Caffe for P-Net, because TensorRT does not support dynamic input resolution.
See: how to fit multi pyramids in tensorRT ? - Jetson TX2 - NVIDIA Developer Forums

I have also tested with only R-Net and O-Net, and the same behavior appears.

model.rar (2.08 MB)

ClancyLian · January 15, 2018, 3:44am

Hi,

I tested again with R-Net and O-Net. This time I retrained the models with ReLU instead of PReLU, so I did not need to write a plugin layer myself. Memory now rises by about 600 MB when I start one process, and the second process also adds about 600 MB. So I think the issue may be in the plugin, and I have attached the plugin source.
trtplugin.h (9.1 KB)

AastaLLL · January 16, 2018, 9:19am

Hi,

From your description, the abnormal memory is probably allocated by the plugin implementation.
From a quick check of your source, there is an allocation call for the PReLU parameters:

CHECK(cudaMalloc(&deviceData, count * sizeof(float)));

However, a PReLU layer should have only a few weights.
Could you check and tell us how much memory is actually allocated?

Thanks.
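One way to sanity-check the allocation is to compute the size that the cudaMalloc call above requests and compare it with the observed growth. The sketch below is plain C++ with no CUDA dependency; preluParamBytes is a hypothetical helper, and the count of 64 in the usage note is an assumed value, not one taken from the attached plugin.

```cpp
#include <cassert>
#include <cstddef>

// Bytes requested by cudaMalloc(&deviceData, count * sizeof(float)).
// For a per-channel PReLU, count would be the number of channels.
std::size_t preluParamBytes(std::size_t count) {
    assert(count > 0);  // an empty parameter blob would indicate a parsing problem
    return count * sizeof(float);
}
```

With an assumed count of 64 and a 4-byte float, this is 256 bytes. If the real count is of that order, the parameter buffer alone cannot account for hundreds of megabytes, so logging the actual value of count in the plugin would be a useful next step.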

zykincs · March 12, 2021, 5:07am

Hi AastaLLL,

As you mentioned, the libraries are shared among all processes. But in my experiments, memory usage increases linearly with the number of processes (about 800 MB per process). Is there any setting we need to make sure the libraries are shared among all processes on the same GPU? Thanks!

kayccc · March 17, 2021, 6:49am

Hi zykincs,

Please open a new topic for your issue.

Thanks

zykincs · May 12, 2021, 1:52am

Hi kayccc,

I have created a new topic: cuDNN take up too much GPU memory
Could you take a look? Thanks
