Environment
TensorRT Version: 7.1.3
GPU Type: Jetson Nano
CUDA Version: 10.2
Python Version (if applicable): 3.6
TensorFlow Version (if applicable): 1.15.4
Steps To Reproduce
After converting our models from ONNX to TensorRT engines, I ran them on a Jetson Nano. However, the Jetson Nano has limited memory, so I set config.max_workspace_size = 128 * 1024 * 1024. Even so, a single model (< 20 MB on disk) uses almost 1.4 GB of memory at runtime.
How can I reduce memory usage?
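For reference, inference on the Nano runs roughly as below when the ~1.4 GB usage shows up. This is a minimal sketch assuming batch size 1, static shapes, and a single input/output binding; load_engine, infer_once, and the pycuda buffer handling are illustrative, not the exact production code.

import numpy as np
import pycuda.autoinit  # noqa: F401 -- importing this creates the CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(engine_path: str) -> trt.ICudaEngine:
    # Deserialize a prebuilt plan instead of rebuilding on every start.
    with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

def infer_once(engine: trt.ICudaEngine, image: np.ndarray) -> np.ndarray:
    with engine.create_execution_context() as context:
        bindings, host_out, dev_in, dev_out = [], None, None, None
        for i in range(engine.num_bindings):
            size = trt.volume(engine.get_binding_shape(i))
            dtype = trt.nptype(engine.get_binding_dtype(i))
            dev_mem = cuda.mem_alloc(size * np.dtype(dtype).itemsize)
            bindings.append(int(dev_mem))
            if engine.binding_is_input(i):
                dev_in = dev_mem
            else:
                dev_out = dev_mem
                host_out = np.empty(size, dtype=dtype)
        # image must already match the input binding's shape and dtype.
        cuda.memcpy_htod(dev_in, np.ascontiguousarray(image))
        context.execute_v2(bindings)
        cuda.memcpy_dtoh(host_out, dev_out)
        return host_out

The build script: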
import logging
import sys
from typing import Union

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

def _build_engine_onnx(input_onnx: Union[str, bytes], force_fp16: bool = False, max_batch_size: int = 1):
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(EXPLICIT_BATCH) as network, \
            builder.create_builder_config() as config, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        if force_fp16:
            logging.info('Building TensorRT engine with FP16 support.')
            if not builder.platform_has_fast_fp16:
                logging.warning('Builder reports no fast FP16 support. Performance drop expected.')
            config.set_flag(trt.BuilderFlag.FP16)
            config.set_flag(trt.BuilderFlag.STRICT_TYPES)

        # Cap the scratch workspace the builder may use when picking layer tactics.
        config.max_workspace_size = 128 * 1024 * 1024

        # Accept either a path to an .onnx file or an already-serialized blob.
        if isinstance(input_onnx, str):
            with open(input_onnx, 'rb') as f:
                input_onnx = f.read()

        if not parser.parse(input_onnx):
            print('ERROR: Failed to parse the ONNX')
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            sys.exit(1)

        if max_batch_size != 1:
            logging.warning('Batch size != 1 is used. Ensure your inference code supports it.')

        # Build an optimization profile from the input name and spatial size.
        profile = builder.create_optimization_profile()
        inp = network.get_input(0)
        im_size = tuple(inp.shape)[2:]  # inp.shape is a Dims object; convert before slicing
        profile.set_shape(inp.name, (1, 3) + im_size, (1, 3) + im_size,
                          (max_batch_size, 3) + im_size)
        config.add_optimization_profile(profile)

        return builder.build_engine(network, config=config)
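For completeness, a hypothetical way to call the function and serialize the result to disk, so the Nano can deserialize a prebuilt plan at startup instead of rebuilding it ('model.onnx' and 'model.plan' are placeholder names):

if __name__ == '__main__':
    engine = _build_engine_onnx('model.onnx', force_fp16=True)
    if engine is None:
        sys.exit('TensorRT engine build failed')
    # engine.serialize() returns a plan that trt.Runtime can deserialize later.
    with open('model.plan', 'wb') as f:
        f.write(engine.serialize())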