How to run a TensorRT-based deep learning model in real time?

I optimized my deep learning model with TensorRT. A C++ interface runs inference on images with the optimized model on a Jetson TX2. This interface provides 60 FPS on average, but it is not stable: inference rates range between 50 and 160 FPS. I need to run this system in real time on an RTOS patched Jetson TX2.

So what are your thoughts on real-time inference with TensorRT? Is it possible to develop a real-time inference system with TensorRT, and if so, how?

I have tried setting high priorities on the process and its threads to get preemption. I expect approximately the same FPS on every inference, i.e. I need a deterministic inference time, but the system does not behave deterministically. Maybe TensorRT is not suitable for real time.
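
For reference, this is roughly the kind of priority setup I tried (a minimal sketch using SCHED_FIFO; the helper name and priority value are just illustrative, not my exact code):

[code]
#include <pthread.h>
#include <sched.h>
#include <cstdio>

// Raise the calling thread to a real-time FIFO priority.
// The priority value is illustrative; the process needs CAP_SYS_NICE
// (or root) for this call to succeed.
static bool setRealtimePriority(int priority)   // e.g. 80, from 1..99
{
    sched_param param{};
    param.sched_priority = priority;
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (rc != 0)
    {
        std::fprintf(stderr, "pthread_setschedparam failed: %d\n", rc);
        return false;
    }
    return true;
}
[/code]

I call something like this from the inference thread before the inference loop starts.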

Thanks.

Hello,

TRT inference times are expected to be fairly deterministic, even when running on a non-realtime OS. Can you provide more details on your “RTOS patched Jetson TX2”? Also, any details on your inference workflow/infrastructure, or even a reproducible example, would help us debug.

Hello,

Thank you for your fast response. I’m sorry, I should have said “real-time patched” instead of “RTOS patched”. To patch the Jetson TX2, I used kozyilmaz’s guide [1]. After patching, the Jetson successfully passed the real-time tests.

You said “even when running on non-realtime OS”, but when running on the non-real-time Jetson the FPS results are also non-deterministic. And the non-real-time Jetson TX2 provides higher FPS than the real-time-patched Jetson.

My inference infrastructure is a modified version of sampleUffMNIST from the TensorRT samples [2], and this figure [3] shows my model’s layers. The only difference is that I use a resize layer instead of a flatten layer because of the ops TensorRT supports.
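
To check the jitter, I time each call roughly like this (a simplified sketch assuming an already-built engine, an IExecutionContext named context, and device bindings in buffers; not my exact code):

[code]
#include <NvInfer.h>
#include <algorithm>
#include <chrono>
#include <cstdio>

// Time 'iterations' consecutive inferences and print min/max/avg latency.
// 'context' is the IExecutionContext of the built engine, 'buffers' holds
// the device pointers for the input/output bindings, and 'batchSize'
// matches the batch size the engine was built for.
void benchmark(nvinfer1::IExecutionContext* context, void** buffers,
               int batchSize, int iterations)
{
    double minMs = 1e9, maxMs = 0.0, totalMs = 0.0;
    for (int i = 0; i < iterations; ++i)
    {
        auto start = std::chrono::high_resolution_clock::now();
        if (!context->execute(batchSize, buffers))   // blocking inference
            std::fprintf(stderr, "inference %d failed\n", i);
        auto end = std::chrono::high_resolution_clock::now();
        double ms = std::chrono::duration<double, std::milli>(end - start).count();
        minMs = std::min(minMs, ms);
        maxMs = std::max(maxMs, ms);
        totalMs += ms;
    }
    std::printf("min %.2f ms  max %.2f ms  avg %.2f ms\n",
                minMs, maxMs, totalMs / iterations);
}
[/code]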

Thanks…

[1]. [url]https://github.com/kozyilmaz/nvidia-jetson-rt/blob/master/docs/README.03-realtime.md[/url]
[2]. [url]https://docs.nvidia.com/deeplearning/sdk/tensorrt-sample-support-guide/index.html#mnist_uff_sample[/url]
[3]. [url]https://pasteboard.co/I752Wby.png[/url]

I just found the “Persistent Threads” topic [1]. Concurrent RT says:

"The use of the persistent threads style can improve determinism significantly,
making modest sized workloads viable for such applications. 
The persistent threads model avoids these determinism problems by launching a CUDA kernel only once,
at the start of the application, and causing it to run until the application ends."

But I cannot find any examples of persistent threading with TensorRT on the Jetson TX2. Has anyone tried out this method? A bare-bones sketch of what I mean is below the reference.

[1]. https://www.concurrent-rt.com/wp-content/uploads/2016/09/Improving-Real-Time-Performance-With-CUDA-Persistent-Threads.pdf
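
For reference, this is the pattern I understand by a “persistent kernel” (a toy sketch of the general idea from the paper, not integrated with TensorRT; the flag-based handshake and the placeholder work are just illustrative):

[code]
#include <cuda_runtime.h>
#include <cstdio>

// Toy persistent kernel: launched once with a single block, it waits until
// the host posts work, processes one "frame", clears the flag, and exits
// when 'quit' is set. Real uses need careful fencing and work distribution.
__global__ void persistentKernel(volatile int* workFlag, volatile int* quit,
                                 float* data, int n)
{
    while (true)
    {
        if (threadIdx.x == 0)
            while (!*workFlag && !*quit) { }   // busy-wait for the host
        __syncthreads();
        if (*quit)
            return;

        for (int i = threadIdx.x; i < n; i += blockDim.x)
            data[i] += 1.0f;                   // placeholder per-frame work

        __syncthreads();
        __threadfence_system();
        if (threadIdx.x == 0)
            *workFlag = 0;                     // tell the host we are done
    }
}

int main()
{
    const int n = 1024;
    volatile int *hWork, *hQuit;               // host views of the flags
    int *dWork, *dQuit;                        // device views of the flags
    float* dData;
    cudaHostAlloc((void**)&hWork, sizeof(int), cudaHostAllocMapped);
    cudaHostAlloc((void**)&hQuit, sizeof(int), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void**)&dWork, (int*)hWork, 0);
    cudaHostGetDevicePointer((void**)&dQuit, (int*)hQuit, 0);
    cudaMalloc(&dData, n * sizeof(float));
    cudaMemset(dData, 0, n * sizeof(float));
    *hWork = 0;
    *hQuit = 0;

    persistentKernel<<<1, 256>>>(dWork, dQuit, dData, n);  // launched once

    for (int frame = 0; frame < 100; ++frame)
    {
        *hWork = 1;                            // post one frame of work
        while (*hWork) { }                     // spin until the kernel clears it
    }

    *hQuit = 1;                                // ask the kernel to exit
    cudaDeviceSynchronize();

    float result;
    cudaMemcpy(&result, dData, sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("data[0] after 100 frames: %f\n", result);
    cudaFree(dData);
    cudaFreeHost((int*)hWork);
    cudaFreeHost((int*)hQuit);
    return 0;
}
[/code]

What I cannot figure out is how to drive TensorRT’s own kernels this way, since the engine launches its kernels internally.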

I read this comment on NVIDIA DevTalk: “If your code uses floating-point atomics, results may differ from run to run because floating-point operations are generally not associative, and the order in which data enters a computation (e.g. a sum) is non-deterministic when atomics are used.”

I used the FP16 precision type when optimizing my model with TensorRT. Is it possible to get deterministic output when using FP16 precision? Any thoughts?
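
To convince myself that the quoted effect is real, I tried a tiny toy along those lines (plain CUDA, nothing to do with TensorRT): summing the same values with float atomicAdd. The totals can differ slightly from run to run:

[code]
#include <cuda_runtime.h>
#include <cstdio>

// Each thread atomically adds its value to a single accumulator.
// Because floating-point addition is not associative and the order of
// atomics is not defined, the final sum can differ slightly between runs.
__global__ void atomicSum(const float* values, float* sum, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        atomicAdd(sum, values[idx]);
}

int main()
{
    const int n = 1 << 20;
    float* hValues = new float[n];
    for (int i = 0; i < n; ++i)
        hValues[i] = (i % 2 ? 1.0f : 1e-7f);   // mix of large and tiny values

    float *dValues, *dSum;
    cudaMalloc(&dValues, n * sizeof(float));
    cudaMalloc(&dSum, sizeof(float));
    cudaMemcpy(dValues, hValues, n * sizeof(float), cudaMemcpyHostToDevice);

    for (int run = 0; run < 5; ++run)
    {
        cudaMemset(dSum, 0, sizeof(float));
        atomicSum<<<(n + 255) / 256, 256>>>(dValues, dSum, n);
        float sum;
        cudaMemcpy(&sum, dSum, sizeof(float), cudaMemcpyDeviceToHost);
        std::printf("run %d: sum = %.6f\n", run, sum);  // may vary per run
    }

    cudaFree(dValues);
    cudaFree(dSum);
    delete[] hValues;
    return 0;
}
[/code]

Of course I don’t know whether TensorRT’s FP16 kernels actually use atomics internally; this only illustrates the point the comment makes.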