Running TensorRT in multiple processes

Hi, I want two processes to use TensorRT to load different models, but Xavier does not support MPS.
Is there any way to reduce the overhead caused by process switching?

Hi,

Yes. Please create a separate CUDA context for each engine (model) and run them with multiple threads in a single process.
You can find a similar implementation in our trtexec app:

/usr/src/tensorrt/samples/common/sampleInference.cpp

Thanks.

Thanks for your reply.

Hi, I tried the multi-threaded approach and observed the following behavior.

Occasionally the inference time increases greatly; most of the time it is normal (vgg inference: 6.9ms, resnet50 inference: 2.9ms).

[resnet50][info][PreProcess][cost time] :0.092ms
[vgg16][info][PreProcess][cost time] :0.369ms
[vgg16][info][Infere][cost time] :8.073ms
[vgg16][info][Post][cost time] :1.049ms
[resnet50][info][Infere][cost time] :9.606ms
[resnet50][info][Post][cost time] :0.032ms
[vgg16][info][PreProcess][cost time] :0.093ms
[resnet50][info][PreProcess][cost time] :0.094ms
[vgg16][info][Infere][cost time] :8.319ms
[vgg16][info][Post][cost time] :1.166ms
[resnet50][info][Infere][cost time] :9.564ms
[resnet50][info][Post][cost time] :0.032ms
[vgg16][info][PreProcess][cost time] :0.09ms
[resnet50][info][PreProcess][cost time] :0.47ms
[vgg16][info][Infere][cost time] :8.127ms
[vgg16][info][Post][cost time] :1.32ms
[resnet50][info][Infere][cost time] :9.051ms
[resnet50][info][Post][cost time] :0.031ms
[vgg16][info][PreProcess][cost time] :0.09ms
[resnet50][info][PreProcess][cost time] :0.503ms
[vgg16][info][Infere][cost time] :8.23ms

Hi,

Here are two possible causes for your reference.

1.
The default clock mode is dynamic, which can cause unstable performance.
Please maximize the device performance first:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2.
Would you mind checking whether there is any unreleased memory inside your app?
This can be verified directly with tegrastats:

$ sudo tegrastats

Thanks.