As part of the initialisation step, I run a single warm-up inference before actually activating the system.
After this inference, the RAM usage is sometimes much higher than it should be.
In most cases it is around:
RAM = 6720/7860 (from jtop() stats)
but at times I measure:
RAM = 7614/7860 (from jtop() stats)
or higher.
When this happens, inference times climb to 1 [sec], 7 [sec], or more.
Any idea why I see such differences in RAM usage after the first inference? (This is a unit test, so it uses the same image, the same data, and of course the same code each time.)
Is there something else I should measure or restrict?
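For reference, a minimal sketch of how the same RAM value can be sampled from the jetson-stats Python API (the "RAM" key name is an assumption and can differ between jetson-stats versions):

```python
# Rough sketch: sample the RAM value that jtop reports.
# The "RAM" key name is an assumption; it varies with the jetson-stats version.
from jtop import jtop

with jtop() as jetson:
    if jetson.ok():
        print(jetson.stats.get("RAM"))
```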
We recommend using pure TensorRT for inference instead.
TF-TRT uses the TensorFlow interface, so by default it occupies most of the GPU memory to allow faster algorithms.
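If you need to stay on the TF-TRT path for now, one thing you can restrict is that default allocation; a minimal sketch, assuming TensorFlow 2.x:

```python
# Sketch: let TensorFlow grow GPU memory on demand instead of
# reserving most of it up front (TensorFlow 2.x API).
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```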
You can try exporting the model as .pb -> .uff (or .onnx) -> .trt to get better performance with pure TensorRT.
Here is an example for your reference:
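A minimal sketch of that path, assuming the .pb has already been converted to model.onnx (for example with the tf2onnx tool) and using the TensorRT 7/8-style Python API (names changed in later releases, so treat this as an outline rather than exact code):

```python
# Sketch: build a TensorRT engine from an ONNX export and save it.
# Assumes model.onnx already exists; API follows TensorRT 7/8 conventions.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path="model.onnx", workspace_mb=256):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the ONNX file into a TensorRT network definition.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = workspace_mb << 20  # cap TensorRT's scratch memory
    return builder.build_engine(network, config)

if __name__ == "__main__":
    engine = build_engine()
    with open("model.trt", "wb") as f:
        f.write(engine.serialize())  # serialized engine, loadable at inference time
```

The serialized model.trt can then be deserialized with trt.Runtime at start-up, so the warm-up inference runs against a fixed-size engine instead of going through TensorFlow's allocator.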