Dear all,
I have two YOLOv3 models using TensorRT and CUDA streams to run at the same time. The platform is Pegasus; the profiling image is below:
https://imgur.com/jlkFPwY
As you can see in Stream 24 and Stream 25, Stream 24 has a lot of gaps between kernels (those kernels come from executing the TRT engine file), and the gaps also contain cudaEventRecord calls, even though I didn't use any Event functions in my code. How can I fix this?
Thank you~
Boyu
Could you please let us know if you are still facing this issue?
Thanks
Hi SunilJB,
I still have this issue. Can the Drive AGX Pegasus run 2 TRT models at the same time?
I mean, not like DriveWorks, which uses the CUDA default stream and makes the 2 models execute sequentially; I want the 2 TRT models to execute at the same time. Is that possible?
Thanks.
Hi @boyuchen,
To run multiple models with TensorRT, I recommend using either NVIDIA DeepStream or the NVIDIA Triton Inference Server.
Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/sdk/triton-inference-server-guide/docs/index.html
If you want to perform multithreading with TensorRT, please refer to the link below for best practices:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-best-practices/index.html#thread-safety
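For reference, here is a minimal sketch of the pattern those docs describe: one `IExecutionContext` per concurrent inference, each enqueued on its own non-default CUDA stream. It assumes a TensorRT 7-style API, an already-deserialized `engine`, and pre-allocated device binding arrays (`bindingsA`/`bindingsB`); error checking is omitted for brevity.

```cpp
// Sketch: two TensorRT inferences enqueued concurrently on separate CUDA
// streams. Assumes `engine` was deserialized elsewhere and the device
// buffers in bindingsA/bindingsB are already allocated (hypothetical names).
#include <cuda_runtime.h>
#include <NvInfer.h>

void runConcurrent(nvinfer1::ICudaEngine* engine,
                   void** bindingsA, void** bindingsB)
{
    // One execution context per in-flight inference: a context is not
    // thread-safe, so it must not be shared between simultaneous enqueues.
    nvinfer1::IExecutionContext* ctxA = engine->createExecutionContext();
    nvinfer1::IExecutionContext* ctxB = engine->createExecutionContext();

    // Two non-default streams so the two inferences can overlap on the GPU
    // (instead of serializing on the default stream, as with DriveWorks).
    cudaStream_t streamA, streamB;
    cudaStreamCreate(&streamA);
    cudaStreamCreate(&streamB);

    // enqueueV2 is asynchronous: both inferences are queued back to back and
    // may run concurrently if the GPU has free SM resources.
    ctxA->enqueueV2(bindingsA, streamA, nullptr);
    ctxB->enqueueV2(bindingsB, streamB, nullptr);

    // Wait for both streams to finish before reading the outputs.
    cudaStreamSynchronize(streamA);
    cudaStreamSynchronize(streamB);

    cudaStreamDestroy(streamA);
    cudaStreamDestroy(streamB);
    ctxA->destroy();
    ctxB->destroy();
}
```

Note that even with separate streams, two large models can still serialize in practice if one of them saturates the GPU; the profiler timeline is the way to confirm actual overlap.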
Thanks