Multiple concurrent Execution Contexts?

CAT · May 15, 2020, 9:22am

I’m working on a project where I have to process multiple images (either 2 or 4) as fast as possible in a CNN.
At the moment I’m just batching the images and doing batch inference in a single IExecutionContext.
I was wondering if it would be possible to create independent Execution Contexts and run them concurrently.
So my questions are:

1.) Instead of batching, can I create multiple Execution Contexts from the same engine?
2.) Will they run concurrently on a single GPU if I use them from separate threads?
3.) Will this be faster than batching the inputs?

Thank you for your answers.

SunilJB · May 15, 2020, 5:56pm

Hi,
Batching will give you higher throughput.
Increase the batch size to fully use the GPU’s resources for better performance.

Thanks

CAT · May 18, 2020, 6:14am

Thank you for your reply.
Just out of curiosity: From your answer I gather that two Execution Contexts wouldn’t actually run in parallel on the GPU?

SunilJB · May 18, 2020, 6:27am

It will run in parallel, and yes, it can have some benefit.
But increasing the batch size to fully use the GPU’s resources might give better performance.

Thanks
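
A minimal sketch of what "running in parallel" might look like with the TensorRT Python API of that era (7.x/8.x): one execution context per image, each enqueued on its own CUDA stream. `create_execution_context()` and `execute_async_v2(bindings, stream_handle)` are TensorRT calls in those versions; the helper names `create_contexts` and `enqueue_on_all` are hypothetical, and the sketch assumes the caller has already built the engine, allocated separate device buffers for each context, and created one CUDA stream handle per context (for example with PyCUDA).

```python
def create_contexts(engine, count):
    """Create `count` independent execution contexts from a single engine.

    `engine` is assumed to be a deserialized tensorrt.ICudaEngine.
    """
    return [engine.create_execution_context() for _ in range(count)]


def enqueue_on_all(contexts, bindings_per_context, stream_handles):
    """Enqueue one inference per context, each on its own CUDA stream.

    bindings_per_context: one list of device pointers (ints) per context.
        Each context needs its own input/output buffers (assumption: the
        caller allocated them).
    stream_handles: one CUDA stream handle (int) per context.

    Returns the list of booleans reported by execute_async_v2. The calls
    only enqueue work; the caller must synchronize every stream before
    reading the outputs.
    """
    if not (len(contexts) == len(bindings_per_context) == len(stream_handles)):
        raise ValueError("need one bindings list and one stream per context")

    results = []
    for context, bindings, stream in zip(
        contexts, bindings_per_context, stream_handles
    ):
        # Asynchronous launch: returns as soon as the work is queued.
        ok = context.execute_async_v2(bindings=bindings, stream_handle=stream)
        results.append(ok)
    return results
```

Whether the kernels from different contexts actually overlap depends on how much of the GPU each one occupies. If a single inference already fills the device, the streams will mostly run one after another, which fits the advice above to try a larger batch first.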

CAT · May 18, 2020, 6:46am

Thank you.
I was just a little confused while batching the input images.
I am experiencing almost no benefit regarding inference time while batching.
For example, when I use batch size 1 for a single image, my network (a pretty big CNN) takes about 5 ms.
In contrast, when I use batch size 4 it takes about 17 ms. I was under the impression that batching 4 images wouldn’t take that much longer than inference on a single image.
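
Working through the figures quoted above (about 5 ms at batch size 1, about 17 ms at batch size 4), batching comes to roughly 4.25 ms per image, only about 15% less than the single-image case. A short calculation makes this explicit; the helper names are hypothetical and the inputs are just the two timings from this post.

```python
def per_image_ms(batch_latency_ms, batch_size):
    """Average time spent per image within one batched inference."""
    return batch_latency_ms / batch_size


def throughput_images_per_s(batch_latency_ms, batch_size):
    """Images processed per second if batches run back to back."""
    return batch_size * 1000.0 / batch_latency_ms


single = per_image_ms(5.0, 1)    # 5.0 ms per image
batched = per_image_ms(17.0, 4)  # 4.25 ms per image

print(f"batch 1: {single:.2f} ms/image, "
      f"{throughput_images_per_s(5.0, 1):.1f} images/s")
print(f"batch 4: {batched:.2f} ms/image, "
      f"{throughput_images_per_s(17.0, 4):.1f} images/s")
print(f"throughput gain from batching: {single / batched:.2f}x")
```

A gain this small would be consistent with a network that already keeps the GPU busy at batch size 1, although the timings alone cannot confirm that.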

1611074199 · June 8, 2021, 3:08am

Hello, I would like to know how it can be run concurrently.

RezaSa · February 14, 2022, 2:48pm

Hi, CAT.
I’m trying to run batched inference with my engine, and I am wondering how I can do it.
I have a CNN model for object detection, so my input size is fixed, and I was wondering how I could increase my GPU’s workload in any way possible.
Thanks.
