Tensorrt multiple process

jhanvi · February 17, 2021, 9:34am

Description

I created a TensorRT engine file for a model, created a context, and ran inference in Python.
It works fine for a single inference. Now I'm trying to load different contexts in the same Python script, using Python's threading module.
A single inference takes 30 ms, but it takes 112 ms when two different contexts run at the same time on two different threads.
I'm trying to run two inferences simultaneously, using two different contexts, so that both finish in 30 ms in total, but it is not performing as expected. Please let me know if there is any documentation where I can read more about multiple inferences / multiple contexts in a single Python script.

I also tried the multiprocessing module in Python, but it does not let me call context.push.
Please let me know what approach I can use.
I have tried running 4 Python scripts in different terminals, each doing inference, and that works fine. I want to incorporate this into a single Python script.
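
For readers hitting the same context.push failure: one commonly reported cause of CUDA errors under multiprocessing is that CUDA state was initialised in the parent process before the workers were forked. A usual workaround is the "spawn" start method, with all GPU setup done inside each worker. The sketch below shows only that process layout, using the standard library. `make_infer`, `worker`, and `start_workers` are hypothetical names, and `make_infer` is a placeholder for the real CUDA context creation and TensorRT engine loading, which are not shown in this thread.

```python
import multiprocessing as mp


def make_infer(engine_path):
    """Hypothetical placeholder for per-process GPU setup.

    In a real worker this is where the CUDA context would be created and the
    TensorRT engine at engine_path deserialized. Here it returns a dummy
    callable so the sketch runs without a GPU.
    """
    return lambda x: (engine_path, x * 2)


def worker(engine_path, in_q, out_q):
    """Set up GPU state inside this process, then serve requests until None."""
    infer = make_infer(engine_path)  # done in the worker, not in the parent
    while True:
        item = in_q.get()
        if item is None:  # sentinel: shut down
            break
        out_q.put(infer(item))


def start_workers(engine_paths):
    """Start one spawned process per engine; return (process, in_q, out_q) triples."""
    ctx = mp.get_context("spawn")  # fresh interpreter, no inherited CUDA state
    workers = []
    for path in engine_paths:
        in_q, out_q = ctx.Queue(), ctx.Queue()
        proc = ctx.Process(target=worker, args=(path, in_q, out_q))
        proc.start()
        workers.append((proc, in_q, out_q))
    return workers
```

Call `start_workers` from under an `if __name__ == "__main__":` guard, because "spawn" re-imports the main module in every child. Send inputs with `in_q.put(...)`, read results with `out_q.get()`, and put `None` on each input queue before joining the processes. This layout mirrors the four-terminals setup; it does not by itself make the processes run in parallel on the GPU.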

spolisetty · February 18, 2021, 6:04pm

Hi @jhanvi,

A GPU can execute only a single context at a time. When you launch multiple contexts, they are scheduled one after another, which increases inference time. If you must use multiple processes, CUDA MPS (Multi-Process Service) may be useful.
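
The effect described above can be illustrated with a toy model. This is not TensorRT code and not a measurement: a lock stands in for a GPU that serves one context at a time, and `time.sleep` stands in for a 30 ms inference (the figure from the question). `fake_infer` and `time_concurrent` are made-up names for illustration.

```python
import threading
import time

INFER_SECONDS = 0.03  # assumed per-inference latency, from the 30 ms in the question

gpu = threading.Lock()  # toy stand-in for a GPU that runs one context at a time


def fake_infer():
    """Pretend to run one inference; the lock forces callers to take turns."""
    with gpu:
        time.sleep(INFER_SECONDS)


def time_concurrent(num_threads):
    """Start num_threads fake inferences at once and return the wall time in seconds."""
    threads = [threading.Thread(target=fake_infer) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


one = time_concurrent(1)
two = time_concurrent(2)
print(f"1 thread: {one * 1e3:.0f} ms, 2 threads: {two * 1e3:.0f} ms")
```

In this model two threads need at least twice the single-inference time, because the work is serialized rather than overlapped. The 112 ms reported in the question is higher than that, so real scheduling and context-switch overhead are presumably involved as well; the model does not try to reproduce them.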

Thank you.

xiebinghua123 · February 21, 2024, 8:15am

Hello, I also want to run inference with a TensorRT engine using multiple processes. Can you share your code to show how to do it?

