TensorRT: maxBatchSize & batchSize / kFLOAT & kHALF / sampleUffMNIST.cpp

Hi,

  1. If I have a LeNet model in TensorFlow, I can set different batch sizes for training and for testing/inference.

I want to run “sampleUffMNIST.cpp” on a Jetson TX2 with “lenet5.uff”. How do I change the batch size for inference?

I understand that maxBatchSize is used to allocate the required memory, but I am not able to change batchSize.

  2. I do not see any significant difference in the average time taken when switching between (nvinfer1) kFLOAT and kHALF with the files and setup above. The result is the same when I run more images. Is there anything that I am missing?

Thank you.

Hi,

  1. The batch size is set here: context->execute(batchSize, &buffers[0])
  • setMaxBatchSize is used when building the TensorRT engine; it is the upper bound the engine reserves memory for.
  • The batch parameter of execute() is the actual batch size used at inference time; it can be any value from 1 up to maxBatchSize. (See the sketch after this list.)
  2. FP16 halves memory use, but it does not always double performance.
    Some layers (e.g., an inner-product layer) can even take longer to process in FP16 mode.
    We encourage you to compare the performance of FP16 and FP32 for your model.
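
Here is a minimal sketch of how the two settings relate, roughly following the structure of sampleUffMNIST (TensorRT 3.x-era API; exact calls such as setHalf2Mode may differ in your TensorRT version):

    // Sketch (hedged): maxBatchSize is fixed when the engine is built;
    // the batch passed to execute() at runtime can be anything <= maxBatchSize.
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    parser->parse(uffFile, *network, nvinfer1::DataType::kFLOAT); // or kHALF for FP16 weights

    builder->setMaxBatchSize(128);  // upper bound; the engine reserves memory for this
    builder->setHalf2Mode(true);    // enable FP16 kernels (TensorRT 3.x name; needs FP16-capable GPU)
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    // ... later, at inference time:
    int batchSize = 64;             // any value from 1 to maxBatchSize
    context->execute(batchSize, &buffers[0]);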

Thanks.

Hi,

Thank you for your response.

I will check FP16 (kHALF) performance on other models.

Regarding the batchSize:
I had already tried changing the batchSize at that location.
In the “void execute(ICudaEngine& engine)” function, I changed “int batchSize = 1” to “int batchSize = 128” (and other values).
This causes an assertion failure in the “void* createMnistCudaBuffer” function: “assert(eltCount == INPUT_H * INPUT_W)” fails,
because in the “calculateBindingBufferSizes” function, “eltCount = volume(dims) * batchSize”.
So eltCount grows with the batch size, while the assert expects it to equal the size of a single image.
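
For reference, the two pieces of the sample interact roughly like this (paraphrased from sampleUffMNIST.cpp, not a verbatim copy):

    // eltCount is computed per binding and already includes the batch size:
    int64_t eltCount = volume(dims) * batchSize;  // in calculateBindingBufferSizes()

    // ... but the input-buffer helper assumes a single image:
    void* createMnistCudaBuffer(int64_t eltCount, DataType dtype, int run)
    {
        assert(eltCount == INPUT_H * INPUT_W);    // fails as soon as batchSize > 1
        ...
    }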

How do I overcome this and proceed from here?

Thank you.

Hi,

If you want to run inference on a batch of 128 images at a time, you also need to prepare input and output buffers sized for N=128.

Input dimension: NxHxWxC
Output dimension: NxClass
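
As a hedged sketch, one way to do this is a batch-aware variant of the sample’s createMnistCudaBuffer (the batchSize parameter, the adjusted assert, and the file naming are assumptions for illustration; readPGMFile, safeCudaMalloc, elementSize, and CHECK are helpers from the sample):

    // Hypothetical batch-aware variant: batchSize images of INPUT_H x INPUT_W,
    // packed back to back in one buffer of eltCount elements.
    void* createMnistCudaBuffer(int64_t eltCount, DataType dtype, int batchSize)
    {
        assert(eltCount == batchSize * INPUT_H * INPUT_W); // was: eltCount == INPUT_H * INPUT_W
        assert(elementSize(dtype) == sizeof(float));

        size_t memSize = eltCount * elementSize(dtype);
        float* inputs = new float[eltCount];

        for (int b = 0; b < batchSize; ++b)
        {
            uint8_t fileData[INPUT_H * INPUT_W];
            readPGMFile(std::to_string(b % 10) + ".pgm", fileData); // assumed file naming
            for (int i = 0; i < INPUT_H * INPUT_W; ++i)
                inputs[b * INPUT_H * INPUT_W + i] = 1.0f - float(fileData[i]) / 255.0f;
        }

        void* deviceMem = safeCudaMalloc(memSize);
        CHECK(cudaMemcpy(deviceMem, inputs, memSize, cudaMemcpyHostToDevice));
        delete[] inputs;
        return deviceMem;
    }

The output buffer would then hold batchSize x 10 class scores, so the result-printing code would need to take an argmax over each group of 10 values, one group per image.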

Thanks.