New TensorRT model occupying more GPU memory compared to the older version


I am converting a TensorFlow model (.h5 → SavedModel format → TensorRT model) using TensorFlow 2.5.0 (environment details attached below). The resulting TensorRT model occupies almost 3.5GB of GPU memory.
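For context, the conversion path is roughly the following. This is a minimal sketch, not my exact script: the function names and arguments are placeholders, and the TensorFlow imports are kept inside the functions so the snippet stands on its own.

```python
def h5_to_saved_model(h5_path, saved_model_dir):
    """Load a Keras .h5 model and re-export it in SavedModel format."""
    from tensorflow import keras  # imported lazily; requires TF 2.x
    model = keras.models.load_model(h5_path)
    model.save(saved_model_dir)  # TF 2.x writes SavedModel format by default


def saved_model_to_trt(saved_model_dir, output_dir, precision="FP16"):
    """Convert a SavedModel to a TF-TRT model via TrtGraphConverterV2."""
    from tensorflow.python.compiler.tensorrt import trt_convert as trt
    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=precision)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params,
    )
    converter.convert()         # build the TRT-compatible graph
    converter.save(output_dir)  # write the converted SavedModel
```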

If I load the same model in the environment specified below, then the TensorRT model occupies at most ~1.1GB of GPU memory:
TensorRT Version:
GPU Type: GeForce RTX 2080 Ti
Nvidia Driver Version: 418.87.00
CUDA Version: 10.1
CUDNN Version: 7.6.2
Operating System + Version: Ubuntu 16.04.7 LTS
Python Version (if applicable): 3.6.13
TensorFlow Version (if applicable): 1.14.1
PyTorch Version (if applicable): NA
Baremetal or Container (if container which image + tag): NA

I also tried using nv-tensorrt-repo-ubuntu1804-cuda11.3-trt8.0.1.6-ga-20210626_1-1_amd64.deb with TF 2.5, but the code won’t run with it; the code only runs if we use CUDA 11.1.


TensorRT Version: 7.2.3-1+cuda11.1
GPU Type: NVIDIA GeForce RTX 3080
Nvidia Driver Version: 470.57.02
CUDA Version: 11.2 & 11.1 (installed along with TensorRT)
CUDNN Version:
Operating System + Version: Ubuntu 18.04.5 LTS
Python Version (if applicable): 3.9.6
TensorFlow Version (if applicable): 2.5.0
PyTorch Version (if applicable): NA
Baremetal or Container (if container which image + tag): NA

Relevant Files (4.2 KB)
dummy.h5 (4.6 MB)
gpu_usage (272 Bytes)

Output files (in the models directory):

gpu_usage_dummy.txt (1.4 KB)
tensorrt_output.txt (16.5 KB)

Steps To Reproduce

Setting up the environment:

  • nvidia-driver installation
  • libnvinfer installation as mentioned here - libnvinfer 7.2.3-1+cuda11.1
  • cuda installation steps: here - change to
    ```
    sudo apt-get -y install cuda-11-2
    ```
  • CUDNN installation: from .deb
    ```
    sudo apt-get install -y --no-install-recommends \
        libnvinfer7=7.2.3-1+cuda11.1 \
        libnvinfer-dev=7.2.3-1+cuda11.1 \
        libnvinfer-plugin7=7.2.3-1+cuda11.1
    ```
  • tensorflow installation: pip install tensorflow==2.5.0

Adding paths in .bashrc:

```
export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Running the code:

  • Put the dummy.h5 model in the models directory
  • Run the conversion script & run gpu_usage to observe the memory occupied
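The attached gpu_usage script is not reproduced here; a minimal stand-in that polls used GPU memory through nvidia-smi could look like this (the helper names are mine, not from the attachment):

```python
import subprocess


def parse_memory_output(out):
    """Parse `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    output: one integer (MiB) per line, one line per GPU."""
    return [int(line.strip()) for line in out.splitlines() if line.strip()]


def read_gpu_memory_mib():
    """Return the used memory of each visible GPU, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_output(out)
```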

Please refer to the installation steps from the link below in case you are missing anything.
Also, we suggest you use TRT NGC containers to avoid any system-dependency-related issues.


I am not installing TensorRT using this guide; I am following the tf-docs for this.

Do I need to install TensorRT as mentioned in the Installation Guide :: NVIDIA Deep Learning TensorRT Documentation? I ask because I am able to convert the model to the TensorRT format without it.

Hey @NVES ,

I followed the installation steps mentioned in the install guide, but the memory issue is still there.


Hi @meet,

Could you please try using TF-TRT on the TensorFlow NGC container and let us know if you still face this issue?

Thank you.

Hey @spolisetty,

I tried the latest container - 21.07-tf2-py3, but the issue is still there. It also occupies far more memory than in the 2080 Ti environment mentioned above in the post.

Hey @spolisetty,

I also tried the PyTorch NGC container. When I run the same inference script on both the 3080 and the 2080 Ti (each with its own environment, as above), the process occupies 1.7GB and 1GB of GPU memory respectively, and the inference time is lower on the 2080 Ti than on the 3080.

Can this be something related to GPU or GPU-Drivers?

Hi @meet,

Yes, GPU architecture and compute capability do matter.
We usually ship high-end GPUs with more device memory, and a newer architecture supports new units such as Tensor Cores, which allows us to develop fancier kernels that use more memory to speed up your NN.
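As an aside, TensorFlow by default reserves most of the free device memory up front, so nvidia-smi readings reflect the allocator as much as the model itself. To rule that out when comparing the two GPUs, you can ask TF to allocate on demand; a sketch using the standard tf.config API (the import is inside the function so the snippet loads without TF installed):

```python
def enable_memory_growth():
    """Ask TensorFlow to grow GPU allocations on demand instead of
    reserving most of the device memory up front. Call before any op
    touches the GPU."""
    import tensorflow as tf  # imported lazily; requires TF 2.x
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
```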

If you observe a larger difference in inference time, please let us know the steps to reproduce the issue.

Thank you.

Hey @spolisetty,

If this is the case, then can you tell me why going from the 21.07 TensorFlow NGC container to the 20.11 one decreased GPU-memory usage from 3.5GB to 1.3GB?

We tested both TF 1.15 and TF 2.5 in the 20.11 TensorFlow NGC container; both occupy the same GPU memory, so we can rule out a TF-version issue in this case.

On the inference side, I have the 3080 on an 8-core CPU and the 2080 Ti on a 32-core CPU.
For now I am assuming it could be a CPU bottleneck.

We will be testing the 3080 with a CPU with more cores. Will keep this thread updated with the results.