Running two YOLOv3 models on CUDA streams with TensorRT shows many cudaEventRecord calls

Dear all,

I have two YOLOv3 models running at the same time with TensorRT and CUDA streams. The platform is Pegasus. Here is the profiling image:
https://imgur.com/jlkFPwY
As you can see in Stream24 and Stream25, Stream24 has a lot of empty gaps between kernels (those kernels come from executing the TRT engine file), and the gaps are filled with cudaEventRecord calls. But in my code I never use any Event functions, so how can I fix this problem?

Thank you~

Boyu

Could you please let us know if you are still facing this issue?

Thanks

Hi SunilJB,

I still have this issue. Can the Drive AGX Pegasus run 2 TRT models at the same time?
I mean, not like DriveWorks, which uses the CUDA default stream and makes the 2 models execute sequentially. I want the 2 TRT models to execute concurrently. Is that possible?
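For reference, concurrent execution of two engines is usually set up by giving each model its own execution context and its own non-default CUDA stream, then enqueueing both before synchronizing. Below is a minimal sketch of that pattern; it assumes two already-deserialized engines (`engine0`, `engine1`) and pre-allocated device bindings (`bindings0`, `bindings1`), which are placeholders, not code from the original post. Whether the kernels actually overlap still depends on GPU occupancy, so this is a sketch of the API usage, not a guarantee of concurrency.

```cpp
#include <cuda_runtime.h>
#include <NvInfer.h>

// Enqueue two TensorRT engines on separate streams so their work can
// overlap, then wait for both. Assumes engines and device-side binding
// pointers were created elsewhere (hypothetical names for illustration).
void runConcurrently(nvinfer1::ICudaEngine* engine0,
                     nvinfer1::ICudaEngine* engine1,
                     void** bindings0, void** bindings1)
{
    // One execution context per engine; contexts must not be shared
    // across concurrent enqueues.
    nvinfer1::IExecutionContext* ctx0 = engine0->createExecutionContext();
    nvinfer1::IExecutionContext* ctx1 = engine1->createExecutionContext();

    // Non-default streams: work on the default stream would serialize
    // with everything else, which is the behavior being avoided here.
    cudaStream_t stream0, stream1;
    cudaStreamCreate(&stream0);
    cudaStreamCreate(&stream1);

    // Enqueue both inferences asynchronously before synchronizing,
    // so the GPU scheduler is free to interleave them.
    ctx0->enqueueV2(bindings0, stream0, nullptr);
    ctx1->enqueueV2(bindings1, stream1, nullptr);

    // Wait for both models to finish.
    cudaStreamSynchronize(stream0);
    cudaStreamSynchronize(stream1);

    cudaStreamDestroy(stream0);
    cudaStreamDestroy(stream1);
    ctx0->destroy();
    ctx1->destroy();
}
```

If the profiler still shows gaps filled with cudaEventRecord, those events are typically issued internally by the TensorRT/CUDA runtime rather than by user code.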

thanks.

Hi @boyuchen,

In order to run multiple models with TensorRT, I would recommend using either NVIDIA DeepStream or the NVIDIA Triton Inference Server.
Please refer to the links below for more details:

If you want to perform multithreading with TensorRT, please refer to the link below for best practices:
Best Practices For TensorRT Performance :: NVIDIA Deep Learning SDK Documentation

Thanks