Dear all,
I have two YOLOv3 models using TensorRT and CUDA streams to run at the same time. The platform is Pegasus; the profiling image is below:
https://imgur.com/jlkFPwY
As you can see in Stream 24 and Stream 25, Stream 24 has a lot of gaps between kernels (those kernels come from executing the TRT engine file), and the gaps also contain cudaEventRecord calls, even though I didn't use any Event functions in my code. How can I fix this?
Thank you~
Boyu
Could you please let us know if you are still facing this issue?
Thanks
Hi SunilJB,
I still have this issue. Can the Drive AGX Pegasus run 2 TRT models at the same time?
I mean, not like DriveWorks, which uses the CUDA default stream and makes the 2 models execute sequentially; I want the 2 TRT models to execute at the same time. Is that possible?
Thanks.
Hi @boyuchen,
To run multiple models with TensorRT, I recommend using either NVIDIA DeepStream or the NVIDIA Triton Inference Server.
Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/sdk/triton-inference-server-guide/docs/index.html
If you want to perform multithreading with TensorRT, please refer to the link below for best practices:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-best-practices/index.html#thread-safety
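For reference, here is a minimal sketch of the pattern those docs describe: one `IExecutionContext` per concurrent inference, each enqueued on its own non-default CUDA stream. It assumes a TensorRT 7-style API, an already-deserialized `engine`, and pre-allocated device binding arrays (`bindingsA`/`bindingsB`); error checking is omitted for brevity.

```cpp
// Sketch: two TensorRT inferences enqueued concurrently on separate CUDA
// streams. Assumes `engine` was deserialized elsewhere and the device
// buffers in bindingsA/bindingsB are already allocated (hypothetical names).
#include <cuda_runtime.h>
#include <NvInfer.h>

void runConcurrent(nvinfer1::ICudaEngine* engine,
                   void** bindingsA, void** bindingsB)
{
    // One execution context per in-flight inference: a context is not
    // thread-safe, so it must not be shared between simultaneous enqueues.
    nvinfer1::IExecutionContext* ctxA = engine->createExecutionContext();
    nvinfer1::IExecutionContext* ctxB = engine->createExecutionContext();

    // Two non-default streams so the two inferences can overlap on the GPU
    // (instead of serializing on the default stream, as with DriveWorks).
    cudaStream_t streamA, streamB;
    cudaStreamCreate(&streamA);
    cudaStreamCreate(&streamB);

    // enqueueV2 is asynchronous: both inferences are queued back to back and
    // may run concurrently if the GPU has free SM resources.
    ctxA->enqueueV2(bindingsA, streamA, nullptr);
    ctxB->enqueueV2(bindingsB, streamB, nullptr);

    // Wait for both streams to finish before reading the outputs.
    cudaStreamSynchronize(streamA);
    cudaStreamSynchronize(streamB);

    cudaStreamDestroy(streamA);
    cudaStreamDestroy(streamB);
    ctxA->destroy();
    ctxB->destroy();
}
```

Note that even with separate streams, two large models can still serialize in practice if one of them saturates the GPU; the profiler timeline is the way to confirm actual overlap.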
Thanks