TensorRT 10.9: Batch Inference with enqueueV3 gives incorrect outputs for second batch

Hi NVIDIA team,

I am using TensorRT 10.9 on Ubuntu 24 in a C++ application to perform inference with a model converted from ONNX.
The model takes 512x512 images as input, and I want to run batch inference with batch size 2 using enqueueV3.

My workflow is as follows:

  1. I crop two regions from a full HD (1920x1080) image:
    • Crop1: (640,0,640,580) → resized to 512x512
    • Crop2: (0,0,1920,1080) → resized to 512x512
  2. I prepare a contiguous input buffer for the two crops in NCHW format.
  3. I call setInputShape() on the execution context with batch size 2 before each enqueueV3 call.
  4. I copy the input buffer to the device and call enqueueV3().
  5. I copy the output buffer back to host.

Problem:

  • The output coordinates for batch 1 and batch 2 are almost identical,
    even though the crops are different.
  • I have confirmed that the input buffer contains the correct data for both batches.
  • Zero-clearing the output buffer before inference does not change the result.

Questions:

  1. Is there anything I might be missing when performing batch inference with enqueueV3 in the TensorRT 10.9 C++ API?
  2. Could this be related to how fixed-size output buffers are used with dynamic batch sizes?

Any guidance or examples would be greatly appreciated.

Thanks in advance!

Hi, following up on this thread —

I wanted to clarify that the model we are using does support dynamic batching.

Steps we used to confirm:

1️⃣ Check ONNX model in Netron

  • The output layer Identity:0 shows shape batch x 6, indicating that the batch dimension is variable.

  • This confirms the ONNX model is dynamic-batch ready.

2️⃣ ONNX → TensorRT conversion

  • We use the following script to generate the TRT engine.

  • The script explicitly sets --minShapes, --optShapes, and --maxShapes to allow dynamic inference for batch sizes 1–4.

#!/bin/bash
# ONNX -> TensorRT engine (dynamic batch)

ONNX_FILE=$1
if [ -z "$ONNX_FILE" ]; then
    echo "*** ERROR: Please specify an ONNX file."
    exit 1
fi

TRT_FILE="${ONNX_FILE%.onnx}.trt"
VENV_PATH="/opt/temp/tf_env"
echo "Activating virtual environment: $VENV_PATH/.venv"
source "${VENV_PATH}/.venv/bin/activate"

trtexec \
    --onnx="$ONNX_FILE" \
    --saveEngine="$TRT_FILE" \
    --minShapes=input:0:1x3x512x512 \
    --optShapes=input:0:2x3x512x512 \
    --maxShapes=input:0:4x3x512x512 \
    --fp16 \
    --verbose

STATUS=$?
deactivate

if [ $STATUS -eq 0 ]; then
    echo "$TRT_FILE has been created."
    echo "Dynamic inference supported for batch sizes 1–4."
else
    echo "*** ERROR: Failed to create TensorRT engine."
    exit 1
fi

Key points:

  • ONNX verified dynamic batch support via Netron

  • TRT conversion script creates an engine capable of batch sizes 1–4

This should help confirm that the model and engine are dynamic-batch ready.

Bump.