@AastaLLL
I am using two threads, each running one TensorRT model, but the inference latency is approximately the same as running the two models serially. From my observation, both threads do run concurrently; however, the time it takes to process each thread doubles.
Details: Multithread tensorrt does not improve inference latency · Issue #1238 · NVIDIA/TensorRT · GitHub
Hi,
The links below might be useful for you:
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#thread-safety
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
For multi-threading/streaming, we suggest using DeepStream or Triton (a rough sketch of the per-thread context/stream pattern is also included at the end of this post).
For more details, we recommend raising the query on the DeepStream or Triton forum.
Thanks!
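For reference, here is a minimal sketch of the pattern those documents describe, assuming the TensorRT 8.x Python API with PyCUDA: a shared CUDA context that each worker thread pushes, one execution context and one non-default CUDA stream per thread, and `execute_async_v2` for enqueueing. The engine file names, input shape, and binding order are placeholders, not code from the linked issue.

```python
# Minimal sketch: one engine per thread, one execution context and one
# non-default CUDA stream per thread (TensorRT 8.x Python API + PyCUDA).
# Engine paths, input shapes, and binding order are placeholders.
import threading
import numpy as np
import pycuda.driver as cuda
import tensorrt as trt

cuda.init()
cuda_ctx = cuda.Device(0).make_context()      # shared CUDA context, pushed per thread

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(TRT_LOGGER)


def load_engine(path):
    with open(path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())


def worker(engine, input_array, results, idx):
    cuda_ctx.push()                           # make the CUDA context current in this thread
    try:
        context = engine.create_execution_context()   # one IExecutionContext per thread
        stream = cuda.Stream()                         # one non-default stream per thread

        # One host/device buffer pair per binding (assumes static shapes).
        bindings, host_bufs, dev_bufs = [], [], []
        for i in range(engine.num_bindings):
            shape = engine.get_binding_shape(i)
            dtype = trt.nptype(engine.get_binding_dtype(i))
            host = cuda.pagelocked_empty(trt.volume(shape), dtype)
            dev = cuda.mem_alloc(host.nbytes)
            host_bufs.append(host)
            dev_bufs.append(dev)
            bindings.append(int(dev))

        np.copyto(host_bufs[0], input_array.ravel())   # assumes binding 0 is the input
        cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
        context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
        cuda.memcpy_dtoh_async(host_bufs[1], dev_bufs[1], stream)  # assumes binding 1 is the output
        stream.synchronize()
        results[idx] = host_bufs[1].copy()
    finally:
        cuda_ctx.pop()


# Hypothetical engine files and input shape, for illustration only.
engines = [load_engine("model_a.engine"), load_engine("model_b.engine")]
inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in engines]
results = [None, None]

threads = [threading.Thread(target=worker, args=(e, x, results, i))
           for i, (e, x) in enumerate(zip(engines, inputs))]
for t in threads:
    t.start()
for t in threads:
    t.join()

cuda_ctx.pop()                                # release the context from the main thread
```

Each thread gets its own execution context because, per the thread-safety section linked above, a single execution context should not be used from multiple threads at the same time; the engine and runtime can be shared.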
Hi @hoangtm.fami,
It depends on the model. If each model uses only a small fraction of the GPU's resources, then multi-threading can reduce latency. Please check GPU utilization.
Thank you.
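One quick way to check is to run nvidia-smi in a second terminal while both threads are busy, or to sample utilization with a small pynvml (nvidia-ml-py) script like the sketch below; the GPU index and sampling interval here are arbitrary. If a single model already keeps the GPU near 100% utilization, running two in parallel will roughly double each one's latency even though the threads overlap.

```python
# Sample GPU utilization while the two inference threads are running.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # GPU 0; adjust if needed

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        print(f"GPU util: {util.gpu}%  memory util: {util.memory}%")
        time.sleep(0.5)
except KeyboardInterrupt:
    pass
finally:
    pynvml.nvmlShutdown()
```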