I’m running TensorRT inference on a Jetson Nano 2GB board. Only the device memory allocated by the TensorRT allocator is released by calling .destroy(); the cuDNN and cuBLAS memory is not released. I found some similar topics (CUDA memory release, GPU memory may leak during deserializing the engine on TensorRT 6), but it looks like that memory cannot be released before the application is terminated.
Since we need to run the application continuously and 2GB of memory is very limited, how can we release the cuDNN and cuBLAS memory without terminating the application?
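For reference, my teardown looks roughly like the sketch below (TensorRT C++ API; the helper and variable names are just illustrative):

```cpp
#include <NvInfer.h>

// Roughly what I do after inference: destroy the TensorRT objects.
// This returns the device memory TensorRT itself allocated, but the
// memory consumed by loading cuDNN/cuBLAS stays with the process.
void releaseTrt(nvinfer1::IExecutionContext* context,
                nvinfer1::ICudaEngine* engine,
                nvinfer1::IRuntime* runtime)
{
    if (context) context->destroy();
    if (engine)  engine->destroy();
    if (runtime) runtime->destroy();
}
```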
The memory is used for loading the cuDNN/cuBLAS libraries.
If you are using TensorRT 8.0 (JetPack 4.6), an alternative is to run inference on the model without using cuDNN.
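For example, you can build the engine with trtexec and remove the cuDNN/cuBLAS tactic sources, along the lines of the following (the model file names are placeholders):

```
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx \
                              --saveEngine=model.trt \
                              --tacticSources=-CUDNN,-CUBLAS
```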
A small question: I converted the .onnx model to a .trt model using trtexec with --tacticSources=-CUDNN,-CUBLAS, and the inference time does not appear to have increased while the results are still correct. Are cuDNN and cuBLAS necessary for inference? Is there any difference?
TensorRT’s dependencies (cuDNN and cuBLAS) can occupy large amounts of device memory. TensorRT allows you to control whether these libraries are used for inference via the TacticSources (C++, Python) attribute in the builder configuration. Note that some operator implementations require these libraries, so the network may not compile when they are excluded.
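As a rough sketch (assuming the TensorRT 8.0 C++ API; the helper function name is ours), clearing the corresponding bits in the builder configuration before building has the same effect as the --tacticSources flag used above:

```cpp
#include <cstdint>
#include <NvInfer.h>

// Clear the cuDNN/cuBLAS tactic-source bits in the builder config so the
// built engine does not rely on those libraries for its tactics.
void disableCudnnCublasTactics(nvinfer1::IBuilderConfig& config)
{
    uint32_t sources = config.getTacticSources();
    sources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUDNN));
    sources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUBLAS));
    sources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUBLAS_LT));  // cuBLASLt variant
    config.setTacticSources(static_cast<nvinfer1::TacticSources>(sources));
}
```

If the build then fails, the network contains an operator that only has cuDNN/cuBLAS implementations, and that tactic source has to stay enabled.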