Slow inference UNet Industrial TF-TRT

Description

Hi,
I am new to NVIDIA tools. I am running the notebook example for training, testing, and exporting UNet Industrial to TF-TRT on the DAGM dataset, provided at UNet Industrial Inference Demo with TF-TRT.

I was able to run the whole pipeline and export the checkpoint for inference with TF-TRT. I am running inference on a single image with the following code:

# inference with the saved TF-TRT model
import horovod.tensorflow as hvd
import tensorflow as tf
from time import time

hvd.init()
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# SAVED_MODEL_DIR and img are defined in earlier notebook cells
start = time()
with tf.Session(graph=tf.Graph(), config=config) as sess:
    # load the exported TF-TRT SavedModel into the fresh graph
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR)
    nodes = [n.name for n in tf.get_default_graph().as_graph_def().node]
    # print(nodes)
    # run the sigmoid output for a single input image
    output = sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})
print(f'Time spent: {time() - start}')

I expected this to be very fast (that is the premise of doing inference with the NGC containers), but it takes 60–70 seconds to run the inference.

What can I do to speed this up?
Is there another way to load my .pb model and run prediction?
The model is attached as a zip file.
TR-TRT-model-FP32.zip (6.5 MB)

Environment

GPU Type: NVIDIA RTX A6000
Container: built with
```
docker build . --rm -t unet_industrial:latest
```

Hi,

Please share the model, script, profiler, and performance output, if not shared already, so that we can help you better.

Alternatively, you can try running your model with the trtexec command.
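For reference, trtexec consumes an ONNX model (or a serialized TensorRT engine) rather than a TF SavedModel, so the model would first need to be converted. A minimal sketch, assuming the SavedModel can be exported with tf2onnx and using placeholder file names:

```
# Convert the SavedModel to ONNX, then benchmark the network with trtexec
python -m tf2onnx.convert --saved-model <SAVED_MODEL_DIR> --output unet_industrial.onnx
trtexec --onnx=unet_industrial.onnx --saveEngine=unet_industrial.plan --fp16
```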

While measuring the model's performance, make sure you consider the latency and throughput of the network inference, excluding the data pre- and post-processing overhead.
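For example, a minimal timing sketch along these lines (assuming the `sess`, `img`, and tensor names from your snippet above, run inside the `with tf.Session(...)` block; the first run is treated as a warm-up because, in dynamic mode, TF-TRT builds its TensorRT engines during the first execution):

```
from time import time

# Warm-up: in TF-TRT dynamic mode the first run also builds the TensorRT
# engines, so it should not count towards steady-state latency.
sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})

# Time only the network inference (no pre-/post-processing inside the loop).
n_runs = 100
start = time()
for _ in range(n_runs):
    sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})
elapsed = time() - start
print(f"Average latency: {elapsed / n_runs * 1000:.2f} ms, "
      f"throughput: {n_runs / elapsed:.1f} images/s")
```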
Please refer to the links below for more details:

Thanks!