Running multiple different engines simultaneously with TensorRT C++

I wrapped GitHub - cyrusbehr/tensorrt-cpp-api: TensorRT C++ API Tutorial in a Classifier class, and I want to run several different ONNX models using an engine-pool design across multiple threads. However, I observed that TensorRT executes the engines serially on the GPU, not concurrently. Is there a way to achieve truly concurrent execution in TensorRT, or should I use Triton Inference Server for this purpose?
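For concreteness, here is a minimal sketch of the pattern I am attempting. The function and tensor names are placeholders, not code from the wrapped project; my understanding is that each thread needs its own IExecutionContext and cudaStream_t for the GPU work to even have a chance of overlapping.

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <memory>

// Illustrative sketch (error checking omitted): one IExecutionContext and
// one cudaStream_t per worker thread. Contexts are not thread-safe, so each
// thread must own its own, while the ICudaEngine itself can be shared.
void runInference(nvinfer1::ICudaEngine* engine,
                  void* inputDev, void* outputDev,
                  char const* inputName, char const* outputName)
{
    // Per-thread execution context; the shared engine stays read-only.
    std::unique_ptr<nvinfer1::IExecutionContext> ctx{
        engine->createExecutionContext()};

    // Per-thread stream, so enqueues from different threads are not
    // serialized on the default stream.
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Tensor names here are placeholders for my models' I/O tensors.
    ctx->setTensorAddress(inputName, inputDev);
    ctx->setTensorAddress(outputName, outputDev);

    ctx->enqueueV3(stream);          // asynchronous launch on this stream
    cudaStreamSynchronize(stream);   // wait only for this stream's work

    cudaStreamDestroy(stream);
}

// Intended use (illustrative): one thread per engine, e.g.
//   std::thread t1(runInference, engineA, inA, outA, "input", "output");
//   std::thread t2(runInference, engineB, inB, outB, "input", "output");
```

Even with a dedicated context and stream per thread like this, I still observe the engines executing one after another on the GPU rather than overlapping.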

Environment

TensorRT Version: 8.6
GPU Type: RTX 3060 and RTX 4060
NVIDIA Driver Version: 550
CUDA Version: 11.8
cuDNN Version: 8.7
Operating System + Version: Ubuntu 18.04+