Hi NVIDIA team,
I am using TensorRT 10.9 on Ubuntu 24 in a C++ application to perform inference with a model converted from ONNX.
The model input is 512x512 images, and I want to run batch inference with batch size 2 using enqueueV3.
My workflow is as follows:
- I crop two regions from a full HD (1920x1080) image:
  - Crop1: (640, 0, 640, 580) → resized to 512x512
  - Crop2: (0, 0, 1920, 1080) → resized to 512x512
- I prepare a contiguous input buffer for the two crops in NCHW format.
- I call setInputShape() on the execution context with batch size 2 before each enqueueV3 call.
- I copy the input buffer to the device and call enqueueV3().
- I copy the output buffer back to host.
Problem:
- The output coordinates for batch 1 and batch 2 are almost identical, even though the crops are different.
- I have confirmed that the input buffer contains the correct data for both batches.
- Zero-clearing the output buffer before inference does not change the result.
Questions:
- Is there something I might be missing when performing batch inference with enqueueV3 in TensorRT 10.9 C++?
- Could this be related to how fixed output buffers are used with dynamic/batch sizes?
Any guidance or examples would be greatly appreciated.
Thanks in advance!
Hi, following up on this thread:
I wanted to clarify that the model we are using does support dynamic batching.
Steps we used to confirm:
1️⃣ Check ONNX model in Netron
- The output layer Identity:0 shows shape batch x 6, indicating that the batch dimension is variable.
- This confirms the ONNX model is dynamic-batch ready.
2️⃣ ONNX → TensorRT conversion
- We use the following script to generate the TRT engine.
- The script explicitly sets --minShapes, --optShapes, and --maxShapes to allow dynamic inference for batch sizes 1–4.
#!/bin/bash
# ONNX -> TensorRT engine (dynamic batch)
ONNX_FILE=$1
if [ -z "$ONNX_FILE" ]; then
    echo "*** ERROR: Please specify an ONNX file."
    exit 1
fi

TRT_FILE="${ONNX_FILE%.onnx}.trt"
VENV_PATH="/opt/temp/tf_env"

echo "Activating virtual environment: $VENV_PATH/.venv"
source "${VENV_PATH}/.venv/bin/activate"

trtexec \
    --onnx="$ONNX_FILE" \
    --saveEngine="$TRT_FILE" \
    --minShapes=input:0:1x3x512x512 \
    --optShapes=input:0:2x3x512x512 \
    --maxShapes=input:0:4x3x512x512 \
    --fp16 \
    --verbose

if [ $? -eq 0 ]; then
    echo "$TRT_FILE has been created."
    echo "Dynamic inference supported for batch sizes 1-4."
else
    echo "*** ERROR: Failed to create TensorRT engine."
fi

deactivate
✅ Key point: the ONNX model exposes a dynamic batch dimension, and the engine is built with an optimization profile covering batch sizes 1–4, so both the model and the engine are dynamic-batch ready.