I have a theoretical question concerning the management of batch sizes in TensorRT.
In imported neural network models, is the first dimension of each layer's shape expected to be dedicated to the batch size, and does TensorRT interpret it that way?
I am asking because I would like to speed up a model originally built in TensorFlow, which I exported to ONNX format with tf2onnx.
In TensorFlow, the first dimension of each layer's shape corresponds to the batch size.
This dimension is set to None to tell TensorFlow that the batch size is not fixed.
Because the first dimension of the shapes is undefined, TensorRT treats the model as having "dynamic shapes".
However, I have no use for dynamic shapes: I would like to build my network with an implicit batch dimension so that I can specify the batch size at execute/enqueue time. (You can't have an implicit batch dimension if you have dynamic shapes.)
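From what I understand, with an ONNX model TensorRT 7 requires the explicit-batch mode, and a variable batch is instead handled through an optimization profile that declares the minimum, optimum, and maximum input shapes; the actual batch is then chosen per inference call. A sketch with trtexec (the input tensor name `input_1` and the 224x224x3 shape are placeholders, not taken from my model):

```shell
# Build an engine whose optimization profile covers batch sizes 1..32.
# Replace "input_1" and the dimensions with your model's actual input.
trtexec --onnx=model.onnx \
        --minShapes=input_1:1x224x224x3 \
        --optShapes=input_1:8x224x224x3 \
        --maxShapes=input_1:32x224x224x3 \
        --saveEngine=model.engine
```

At runtime, the batch size within the [min, max] range is then selected on the execution context before calling executeV2/enqueueV2.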
**TensorRT Version**: 7.2
**CUDA Version**: 11.2
**TensorFlow Version (if applicable)**: 2.2
NB: I also noticed that in the ONNX format, only the input and the output have a dim_param set to "unknown" or -1, whereas in TensorFlow every layer has its first shape dimension set to None.