Running two YOLOv3 TensorRT models on CUDA streams shows many cudaEventRecord calls

Dear all,

I have two YOLOv3 models that use TensorRT and CUDA streams to run at the same time. The platform is Pegasus. The profiling image is here:
https://imgur.com/jlkFPwY
As you can see in Stream 24 and Stream 25, Stream 24 has large gaps between the kernels (those kernels come from executing the TRT engine file), and the gaps are occupied by cudaEventRecord calls. But my code does not call any Event functions, so how can I fix this problem?

Thank you~

Boyu

Could you please let us know if you are still facing this issue?

Thanks

Hi SunilJB,

I still have this issue. Can the Drive AGX Pegasus run two TRT models at the same time?
I mean, not like DriveWorks, which uses the CUDA default stream and makes the two models execute sequentially; I want the two TRT models to execute concurrently. Is that possible?

thanks.

Hi @boyuchen,

To run multiple models with TensorRT, I recommend using either NVIDIA DeepStream or the NVIDIA Triton Inference Server.
Please refer to the link below for more details:

https://docs.nvidia.com/deeplearning/sdk/triton-inference-server-guide/docs/index.html

If you want to perform multi-threading with TensorRT, please refer to the link below for best practices:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-best-practices/index.html#thread-safety
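For illustration, here is a minimal sketch of that thread-safety pattern: one host thread per model, each with its own IExecutionContext and its own CUDA stream, so the two engines can be enqueued concurrently instead of serializing on the default stream. This is not from the thread above; engine deserialization and device-buffer allocation are omitted as placeholders, and the `engineA`/`engineB`/`bindingsA`/`bindingsB` names are assumptions.

```cpp
#include <thread>
#include <cuda_runtime.h>
#include <NvInfer.h>

// Placeholders: engines deserialized and binding buffers allocated elsewhere.
extern nvinfer1::ICudaEngine* engineA;
extern nvinfer1::ICudaEngine* engineB;
extern void** bindingsA;
extern void** bindingsB;

void infer(nvinfer1::ICudaEngine* engine, void** bindings)
{
    // One execution context per thread: the engine may be shared across
    // threads, but an IExecutionContext must not be.
    nvinfer1::IExecutionContext* ctx = engine->createExecutionContext();

    // A dedicated non-default stream per model avoids serializing on the
    // default stream.
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    ctx->enqueueV2(bindings, stream, nullptr); // asynchronous launch
    cudaStreamSynchronize(stream);             // wait for this model only

    cudaStreamDestroy(stream);
    ctx->destroy();
}

int main()
{
    // Launch both YOLOv3 engines from separate host threads so their
    // kernels can overlap on the GPU (subject to SM availability).
    std::thread t1(infer, engineA, bindingsA);
    std::thread t2(infer, engineB, bindingsB);
    t1.join();
    t2.join();
    return 0;
}
```

Note that even with separate streams, the kernels only overlap if the GPU has free SMs; with two large YOLOv3 engines you may still see partial serialization in the profiler.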

Thanks