I’m working on a project where I have to process multiple images (either 2 or 4) as fast as possible in a CNN.
At the moment I’m just batching the images and doing batch inference in a single IExecutionContext.
I was wondering if it would be possible to create independent Execution Contexts and run them concurrently.
So my questions are:
1.) Instead of batching, can I create multiple Execution Contexts from the same engine?
2.) Will they run concurrently on a single GPU if I use them from separate threads?
3.) Will this be faster than batching the inputs?
Thank you.
I was just a little confused while batching the input images.
I am seeing almost no improvement in inference time from batching.
For example, when I use batch size 1 for a single image, my network (a pretty big CNN) takes about 5 ms.
In contrast, when I use batch size 4 it takes about 17 ms. I was under the impression that batching 4 images wouldn’t take that much longer than inference on a single image.
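To put rough numbers on that, here is a small sketch of the per-image cost those timings imply. It assumes the 5 ms and 17 ms figures are wall-clock times for one whole batch and treats them as exact, although they are only approximate measurements. The helper names are made up for illustration and are not part of any TensorRT API.

```python
# Approximate timings quoted above: batch size -> latency of one batch in ms.
# Assumption: these are wall-clock times for the whole batch.
timings_ms = {1: 5.0, 4: 17.0}


def per_image_ms(batch_latency_ms, batch_size):
    """Average time spent per image when a whole batch takes batch_latency_ms."""
    return batch_latency_ms / batch_size


def throughput_img_per_s(batch_latency_ms, batch_size):
    """Images processed per second at the given batch latency."""
    return batch_size * 1000.0 / batch_latency_ms


for batch_size, latency in timings_ms.items():
    print(
        f"batch={batch_size}: "
        f"{per_image_ms(latency, batch_size):.2f} ms/image, "
        f"{throughput_img_per_s(latency, batch_size):.1f} images/s"
    )

# Speedup of one batch of 4 over four sequential batch-1 runs.
speedup = (4 * timings_ms[1]) / timings_ms[4]
print(f"speedup from batching: {speedup:.2f}x")
```

Under these assumptions, batching 4 images saves only about 3 ms compared with running the single-image case four times (20 ms versus 17 ms), which matches the small gain described above.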
Hi, CAT.
I’m trying to run batch inference with my engine, and I am wondering how to do it.
I have a CNN model for object detection, so my input size is fixed, and I was wondering how I could increase my GPU’s workload in any way possible.
Thanks.