TF-TRT model very slow to load, with poor performance

beligonmathieu · February 6, 2021, 11:52pm

Hi !

I am trying to run the latest models from the TensorFlow 2 Detection Model Zoo (models/tf2_detection_zoo.md at master · tensorflow/models · GitHub) on a Jetson Xavier NX.

I tried to adapt the example from this blog (Accelerating Inference In TF-TRT User Guide :: NVIDIA Deep Learning Frameworks Documentation) to run a MobileNetV2.

I have 2 issues:

- It takes about 25 minutes to get the model ready to run in the inference script; I'd like the load time to be under 5 minutes.
- The FPS is really low (around 6 FPS).

I downloaded ssd_mobilenet_v2_320x320_coco17_tpu-8 from the model zoo (models/tf2_detection_zoo.md at master · tensorflow/models · GitHub) and unzipped it into ./coco_models.

Here is my script to convert from TF to TF-TRT:

from pathlib import Path

from numpy import uint8
from numpy.random.mtrand import normal
from tensorflow.python.compiler.tensorrt.trt_convert import (
    TrtConversionParams,
    TrtGraphConverterV2,
    TrtPrecisionMode,
)


def convert(model_directory: Path):
    converter = TrtGraphConverterV2(
        input_saved_model_dir=str(model_directory / "saved_model"),
        # FP16 precision with a 4 GiB TensorRT workspace.
        conversion_params=TrtConversionParams(
            precision_mode=TrtPrecisionMode.FP16, max_workspace_size_bytes=1 << 32
        ),
    )
    converter.convert()

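    def fake_inputs():
        # Engines are built for this shape: (batch, height, width, channels),
        # matching the 1280x720 frames that OpenCV returns as (720, 1280, 3).
        yield (normal(size=(1, 720, 1_280, 3)).astype(uint8),)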

    converter.build(input_fn=fake_inputs)
    converter.save(str(model_directory / "trt"))


if __name__ == "__main__":
    convert(Path("coco_models/ssd_mobilenet_v2_320x320_coco17_tpu-8"))

And the script to infer from the model:

from time import time

import cv2
from numpy import expand_dims
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from tensorflow.python.framework.ops import convert_to_tensor
from tensorflow.python.saved_model import saved_model
from tensorflow.python.saved_model.signature_constants import DEFAULT_SERVING_SIGNATURE_DEF_KEY
from tensorflow.python.saved_model.tag_constants import SERVING


saved_model_loaded = saved_model.load("coco_models/ssd_mobilenet_v2_320x320_coco17_tpu-8/trt", tags=[SERVING])
graph_func = saved_model_loaded.signatures[DEFAULT_SERVING_SIGNATURE_DEF_KEY]
frozen_func = convert_variables_to_constants_v2(graph_func)


def demo_object_detection():
    cap = cv2.VideoCapture(
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)1280, height=(int)720, "
        "format=(string)NV12, framerate=(fraction)60/1 ! "
        "nvvidconv flip-method=0 ! "
        "video/x-raw, width=(int)1280, height=(int)720, "
        "format=(string)BGRx ! "
        "videoconvert ! appsink",
        cv2.CAP_GSTREAMER,
    )

    while True:
        ret, image = cap.read()

        # cap.read() returns a boolean success flag, never None
        if not ret:
            break

        t = time()

        # Detection
        _, [boxes], [ids], _, [scores], *_ = (
            x.numpy() for x in frozen_func(convert_to_tensor(expand_dims(image, axis=0)))
        )
        print(f"FPS: {1 / (time() - t):.1f}")

        # display
        for box, class_id, score in zip(boxes, ids, scores):
            if score > 0.5:
                cv2.rectangle(
                    image,
                    (
                        int(box[1] * image.shape[1]),
                        int(box[0] * image.shape[0]),
                    ),
                    (
                        int(box[3] * image.shape[1]),
                        int(box[2] * image.shape[0]),
                    ),
                    (255, 255, 255),  # white boxes on a uint8 BGR frame
                    2,
                )
        cv2.imshow("object detection", image)

        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    # When everything done, release the capture
    cap.release()

    cv2.destroyAllWindows()


if __name__ == "__main__":
    demo_object_detection()

Is there something I'm doing wrong?

Many thanks !

AastaLLL · February 8, 2021, 5:43am

Hi,

You can serialize the TF-TRT model after the first launch.
Next time, you can then load the model directly from that file to save time.
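
A minimal sketch of what this serialization could look like in TF 2.x, assuming the same `TrtGraphConverterV2` API used in the conversion script above. Calling `build()` before `save()` is what stores the TensorRT engines with the SavedModel; they can only be reused later if the inference input shape matches the shape used during `build()`. The helper names and the frame shape are illustrative assumptions, not part of the original scripts.

```python
from pathlib import Path

# Assumption: frames are fed as (batch, height, width, channels) uint8.
FRAME_SHAPE = (1, 720, 1280, 3)


def trt_output_dir(model_directory: Path) -> Path:
    """Directory that receives the converted SavedModel and its engines."""
    return model_directory / "trt"


def build_and_save(model_directory: Path, frame_shape=FRAME_SHAPE) -> Path:
    """Convert once, pre-build the engines, and serialize them to disk."""
    # Imported lazily so this module can be imported without TensorFlow.
    import numpy as np
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=str(model_directory / "saved_model"),
        conversion_params=trt.TrtConversionParams(
            precision_mode=trt.TrtPrecisionMode.FP16,
            max_workspace_size_bytes=1 << 32,
        ),
    )
    converter.convert()

    def input_fn():
        # Yield one sample per input shape expected at inference time.
        yield (np.zeros(frame_shape, dtype=np.uint8),)

    converter.build(input_fn=input_fn)  # builds the TensorRT engines now
    out = trt_output_dir(model_directory)
    converter.save(str(out))  # engines are written alongside the SavedModel
    return out


def load_serialized(model_directory: Path):
    """Later runs: load the already converted model instead of reconverting."""
    import tensorflow as tf

    loaded = tf.saved_model.load(str(trt_output_dir(model_directory)))
    return loaded.signatures["serving_default"]
```

With this layout, `build_and_save(...)` would be run once on the Jetson itself, and the inference script would only call `load_serialized(...)`.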

For inference, this depends on the model architecture itself.
Is your model a modified version of ssd_mobilenet_v2?

If yes, you can run inference on it with pure TensorRT as below:
This will give you much better performance.

Thanks.

beligonmathieu · February 8, 2021, 9:08pm

Thanks a lot for your answer AastaLLL !

If I understand correctly, serializing the model would only improve the loading time, not the inference time. Is that correct?

If so, I'd like to go with pure TensorRT. I successfully managed to convert an ssd_mobilenet_v2 from TensorFlow 1 into a TensorRT model and to run it at 60+ FPS.

However, I'm trying to run some models from the TensorFlow 2 Detection Model Zoo (models/tf2_detection_zoo.md at master · tensorflow/models · GitHub). My example uses ssd_mobilenet_v2, but some of the other models (CenterNet and EfficientNet, for example) seem very promising, and I'd like to try them on the Jetson eventually. Those models are only available pretrained on TF2, so I need a way to convert a TF2 model to TensorRT.

Do you have any links or resources on how to achieve that?

Many thanks !

AastaLLL · February 23, 2021, 7:46am

Hi,

Yes.
Launching TensorRT with a serialized engine only saves the conversion time.
Inference time will be the same.
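
One way to check this on a given setup is to time the two phases separately. The sketch below uses only the Python standard library; `timed` and `steady_state_fps` are hypothetical helper names, and `infer` stands for whatever callable runs the model. The first few calls are typically slower than later ones, so they are discarded as warm-up before averaging.

```python
from time import perf_counter


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call, e.g. a model load."""
    start = perf_counter()
    result = fn(*args)
    return result, perf_counter() - start


def steady_state_fps(infer, frame, warmup=5, runs=50):
    """Average FPS over `runs` calls after discarding `warmup` calls."""
    for _ in range(warmup):
        infer(frame)  # warm-up calls are not measured
    start = perf_counter()
    for _ in range(runs):
        infer(frame)
    elapsed = perf_counter() - start
    # Guard against a zero reading from a very coarse timer.
    return runs / elapsed if elapsed > 0 else float("inf")
```

For example, `model, load_seconds = timed(load_fn, path)` would give the load time, and `steady_state_fps(model, frame)` the throughput once the model is warm, where `load_fn`, `path`, and `frame` are placeholders for the actual loader and input.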

For a TensorFlow 2.x based model, please first convert it into ONNX format.
There are public tools that can do this, e.g. tf2onnx.

Once you get an ONNX model, you can launch TensorRT with the file directly.

/usr/src/tensorrt/bin/trtexec --onnx=[your/file/path]
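
The two steps above could be scripted roughly as follows. This is a sketch under assumptions: it relies on the `tf2onnx.convert` command-line entry point and on the `--saveEngine`, `--loadEngine`, and `--fp16` options of trtexec, none of which appear in this thread; the opset value and all file names are placeholders. Whether a particular detection model converts cleanly is not guaranteed.

```python
import subprocess
import sys

# Path taken from the command above.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def tf2onnx_command(saved_model_dir, onnx_path, opset=11):
    """Command line that exports a TF2 SavedModel to ONNX with tf2onnx."""
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", saved_model_dir,
        "--output", onnx_path,
        "--opset", str(opset),  # assumption: adjust to what TensorRT supports
    ]


def trtexec_build_command(onnx_path, engine_path, fp16=True):
    """Command line that builds an engine from ONNX and saves it to disk."""
    cmd = [TRTEXEC, f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")
    return cmd


def trtexec_load_command(engine_path):
    """Command line that reuses a saved engine, skipping the conversion."""
    return [TRTEXEC, f"--loadEngine={engine_path}"]


def run(cmd):
    """Run one of the commands above and raise if it fails."""
    subprocess.run(cmd, check=True)
```

A typical sequence would be `run(tf2onnx_command(...))`, then `run(trtexec_build_command(...))` once, and `run(trtexec_load_command(...))` on later launches so that the engine is not rebuilt each time.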

Thanks.

alex283hl · July 12, 2021, 10:06am

Thank you for your response! I have the same problem. When I converted my model in trt.TrtPrecisionMode.FP32 mode and then loaded it with tf.keras.models.load_model, the loading time was very slow. Could you explain how I can serialize the TF-TRT model after the first launch? Must I save it using some special command or utility? For example, if I convert my TF-TRT model to ONNX format, will I be able to use it in my script with quick loading?

kayccc · July 21, 2021, 5:39am

Hi alex283hl,

Please open a new topic for your issue. Thanks.
