Output of batch inference

I’m trying to make my TensorRT object-detection program infer two images at the same time (batch inference). The inferred results from TensorRT are stored in an output array. When batchSize == 1 the program works well; however, when I set batchSize == 2, I only get correct results for the first image, and the results for the second image in the output array are all zeros. The following is a detailed description of my output.

If I set batchSize = 1, the size of the output array is 136459 and I can get the predicted bounding boxes. When I set batchSize = 2, the size of the output array is 272918 and I can get the predicted results of the first image at indices 0~136458. But the results of the second image, which should start at index 136459 of the array, are all zeros.

Snapshots of content of the output array:
https://drive.google.com/open?id=1ZOV4SInQVzLCG_n_-vwoVkeqrVzZfckX ; Results of the first image
https://drive.google.com/open?id=18YwUggoTjaEsS9WjAJ7QySCCp0heZna9 ; Results of the second image

Additional information:

  1. I’ve tried both mTrtContext->execute() and mTrtContext->enqueue(), and the inferred results are the same.
  2. Inference time if batchSize == 1: 18.5816 ms
  3. Inference time if batchSize == 2: 35.7012 ms

I want to ask:

  1. If my input array containing two images in a batch is correct, should the output after inference contain the bounding boxes of both images?
  2. Are the results inferred by mTrtContext->execute() or mTrtContext->enqueue() guaranteed to be correct?
  3. Is there a size limitation on the output when using batch inference in TensorRT? In other words, is an output array of size 272918 too big for TensorRT?

I’ve encountered the same issue, but with the Python API. Have you solved the problem? Thank you, and I hope for your reply.

Are you making sure to adjust the binding dimensions?

context->setBindingDimensions(binding_idx, nvinfer1::Dims4(batch_size, c, h, w));

I’ve tried it in the Python API before calling execute_async(), as follows:

context.set_binding_shape(0,trt.tensorrt.Dims([images.shape[0], 112, 112, 3]))

but the outputs are the same. By the way, only the batch size varies; the other dimensions stay fixed.

Hi, ethanyhzhang:
I’ve solved the issue. My TensorRT program is implemented in C++, and the model contains a custom layer that did not yet handle multiple images in a batch.

Please check the author’s repository to see how to process multiple images in custom layers:
https://github.com/lewes6369/tensorRTWrapper/blob/0aaab5110d0794c7c374c7f46fbde2050b459556/code/src/YoloLayer.cu

The keyword to look for is “batchSize”, which indexes over the multiple input images.

Thanks!