TensorFlow-GPU using high system memory, which is the bottleneck

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 18.04): Jetson TX2, JetPack 4.5.1-b17
  • TensorFlow installed from (source or binary): source (TensorFlow 2.6.0/2.3.1)
  • TensorFlow version (use command below): TensorFlow 2.6.0 with CUDA 10.2
  • CUDA/cuDNN version: CUDA Toolkit 10.2 / cuDNN 8.0
  • System memory: 8 GB

I use TensorFlow with CUDA for object inference. It uses as much as 2 GB of system memory, which is a bottleneck for me. I am using C++-based inference code.

I tried the options below, but they did not help:

tensorflow::SessionOptions options;
// Limit the GPU memory pool and let it grow on demand.
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(0.2);
options.config.mutable_gpu_options()->set_allow_growth(true);
options.config.mutable_gpu_options()->set_force_gpu_compatible(true);
// Shrink the op-scheduling thread pools to a single thread each.
options.config.set_inter_op_parallelism_threads(1);
options.config.set_intra_op_parallelism_threads(1);
options.config.set_use_per_session_threads(false);
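
For reference, here is how these options feed into session creation (a minimal sketch with the standard TensorFlow C++ session API; graph loading is omitted):

#include "tensorflow/core/public/session.h"

tensorflow::Session* session = nullptr;
TF_CHECK_OK(tensorflow::NewSession(options, &session));
// session->Create(graph_def) and session->Run(...) follow as usual.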

I opened a ticket with TensorFlow, but they say it is CUDA that is taking the huge amount of memory. When I run the same code without CUDA, on the CPU only, memory usage is reduced by 2.4 GB.

Let me know how to overcome this issue.

Hi,

This is a known issue with the TensorFlow implementation.

Our suggestion is to use another, more edge-friendly framework such as TensorRT.
In general, TensorFlow occupies 2x~3x the memory of TensorRT.
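
As a concrete starting point, here is a minimal sketch of building a TensorRT engine from an ONNX model with a capped builder workspace ("model.onnx" is a placeholder; this uses the TensorRT 8 C++ API):

#include <iostream>
#include "NvInfer.h"
#include "NvOnnxParser.h"

class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {
    if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
  }
} gLogger;

nvinfer1::ICudaEngine* BuildEngine() {
  nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
  auto* network = builder->createNetworkV2(
      1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
  auto* parser = nvonnxparser::createParser(*network, gLogger);
  parser->parseFromFile("model.onnx", static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

  nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
  // Cap the scratch memory the builder may use for tactic selection (64 MB here).
  config->setMaxWorkspaceSize(64ULL << 20);

  return builder->buildEngineWithConfig(*network, *config);
}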

Thanks.

Thanks for the reply. We will definitely try TensorRT. I have one follow-up question:
When we posted this issue to TensorFlow, they pointed out that the CUDA libraries themselves are 2.3 GB in size, hence this memory consumption. Does that mean all the CUDA libraries need to be loaded into memory at once to make the GPU functional?

Hi,

The answer should be yes.
TensorFlow uses cuDNN to implement its GPU inference functions, which requires a lot of memory.
TensorRT has a similar problem, but to a lesser degree thanks to its optimizations.

However, in our latest TensorRT release (v8.0 in JetPack 4.6), we provide an alternative that lets users deploy a model with cuBLAS instead of cuDNN to save memory.
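
In code, this corresponds to the tactic-source API added in TensorRT 8.0 (a sketch; config is the nvinfer1::IBuilderConfig used when building the engine):

// Drop cuDNN from the builder's tactic sources and keep cuBLAS.
nvinfer1::TacticSources tactics = config->getTacticSources();
tactics &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUDNN));
tactics |= 1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUBLAS);
config->setTacticSources(tactics);

With trtexec, the equivalent flag should be --tacticSources=-CUDNN,+CUBLAS.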

Thanks.