Running inference on 32 input tiles gives the expected result:
context->execute(32, &internalConfig->buffers[0]);
Then running inference on 31 input tiles also gives the expected result:
context->execute(31, &internalConfig->buffers[0]);
But going back to 32 input tiles then gives the wrong result for input 32:
context->execute(32, &internalConfig->buffers[0]);
I can provide code if required, but initially I am just asking whether this is a known issue.
If I always run inference on the whole batch size then it’s all fine.
It just seems a waste when I don’t have enough tiles for a whole batch.
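For reference, the full-batch workaround described above can be sketched as follows. This is only a minimal illustration, not a fix for the underlying behaviour: padToFullBatch is a hypothetical helper (not part of TensorRT), and it assumes the host-side input is a contiguous float buffer holding one tile after another, with tileElems floats per tile. It zero-fills the unused tail so that stale data from an earlier batch is not fed to the network, and returns the batch size to pass to execute().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a TensorRT API.
// Assumes hostInput stores tiles back to back, tileElems floats per tile.
// Grows or shrinks the buffer to exactly maxBatch tiles and zero-fills every
// tile after the first validTiles, then returns the batch size to execute.
inline int padToFullBatch(std::vector<float>& hostInput, int validTiles,
                          int maxBatch, std::size_t tileElems)
{
    assert(validTiles >= 0 && validTiles <= maxBatch);

    const std::size_t validElems = static_cast<std::size_t>(validTiles) * tileElems;
    const std::size_t fullElems  = static_cast<std::size_t>(maxBatch) * tileElems;

    // Make the buffer exactly one full batch long (new elements are zero).
    hostInput.resize(fullElems);

    // Overwrite any stale tiles left over from a previous, larger batch.
    std::fill(hostInput.begin() + static_cast<std::ptrdiff_t>(validElems),
              hostInput.end(), 0.0f);

    return maxBatch;
}
```

After copying the padded buffer to the device, the call would stay context->execute(32, &internalConfig->buffers[0]); as in the first case, and only the first 31 outputs would be read back. The cost is one wasted tile of compute per partial batch.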
Environment
TensorRT Version: 7.1.3.0
GPU Type: Jetson NX
Nvidia Driver Version:
CUDA Version: 10.2.89
CUDNN Version: 8.0.0.180
Operating System + Version: Jetpack 4.4.1
Python Version (if applicable):
TensorFlow Version (if applicable): 1.15
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
You can refer to the link below for the list of all supported operators. If an operator is not supported, you need to create a custom plugin to support that operation.
Also, please share your model and script, if you have not already, so that we can help you better.
Meanwhile, for some common errors and queries, please refer to the link below: