So far I have been running a neural network with a batch size of 1 through the C++ API. Because of how I manipulate data in the rest of my program, the data is already on the GPU, so I am not performing any host-to-device copies. My program structure is as follows:
int inputIndex = mEngine->getBindingIndex("Input");
int outputIndex = mEngine->getBindingIndex("Output");
assert(inputIndex == 0 || inputIndex == 1);
assert(outputIndex == 0 || outputIndex == 1);

// Double check that the bindings above are really input and output
assert(mEngine->bindingIsInput(inputIndex) == true);
assert(mEngine->bindingIsInput(outputIndex) == false);

// One device pointer per binding, indexed by binding index
void* buf[2];
buf[inputIndex] = gpuTensorInput;   // gpuTensorInput is a void*
buf[outputIndex] = tensorOutput;    // tensorOutput is a void*

bool status = context->executeV2(buf);
Right now, I am passing one GPU pointer for the input and another GPU pointer for the output, for a batch size of 1. buf hands these pointers to the executeV2 method to run inference on the network. This is working fine.
Now my question is: what if I would like to increase the batch size? I will always do fixed-size inference (no need for dynamic input shapes), so my plan was to alter the input size in the ONNX model I import from (really just the first dimension). Correct me if I am wrong, but if the original input was (1,3,224,224) and I wanted a batch size of, say, 32, I would resize the dimensions of the input layer to (32,3,224,224) and set the max batch size by doing something like:
mParams.batchSize = 32;
builder->setMaxBatchSize(mParams.batchSize);
But I am unsure how to feed in pointers for the multiple samples in a batch, as I do above for one. Would I have to pass 32 pointers each for the input and output tensors? Or, when batching, would I first have to put all the input data into one contiguous CUDA buffer of 32*3*224*224 elements and give the pointer to its first element to the execute method (as opposed to the executeV2 method)?
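To make the size arithmetic in the second option concrete, here is a minimal sketch of how a single contiguous batched buffer would be laid out. It assumes float32 data in a dense NCHW layout, which is my assumption and not something I have confirmed for this engine. BatchShape and the helper functions are hypothetical names for illustration, not part of TensorRT.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not a TensorRT type): a fixed NCHW tensor shape.
struct BatchShape {
    std::size_t n;  // batch size
    std::size_t c;  // channels
    std::size_t h;  // height
    std::size_t w;  // width
};

// Number of elements in one sample, e.g. 3*224*224.
constexpr std::size_t elementsPerSample(const BatchShape& s) {
    return s.c * s.h * s.w;
}

// Number of elements in the whole batch, e.g. 32*3*224*224.
constexpr std::size_t totalElements(const BatchShape& s) {
    return s.n * elementsPerSample(s);
}

// Size in bytes of the whole batch, assuming float32 elements.
constexpr std::size_t totalBytes(const BatchShape& s) {
    return totalElements(s) * sizeof(float);
}

// Address of the first element of sample i inside one contiguous buffer.
// With a dense NCHW layout, sample i starts i * (C*H*W) elements after base.
inline float* samplePointer(float* base, const BatchShape& s, std::size_t i) {
    assert(i < s.n);
    return base + i * elementsPerSample(s);
}
```

For a (32,3,224,224) input this gives 150528 elements per sample and 4816896 elements in total. Under these assumptions, each of the 32 inputs would be written at samplePointer(base, shape, i), and only base would go into buf[inputIndex], rather than 32 separate pointers.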
TensorRT Version: 7.1
CUDA Version: 11.0
CUDNN Version: 8.0.2
Operating System + Version: Windows 10