Wrong results when using Triton dynamic batching



I enabled dynamic_batching in the Triton model configuration as described in server/model_configuration.md at main · triton-inference-server/server · GitHub. With dynamic_batching, throughput increased by about 40%, but there was a significant chance of getting wrong results. For the same input, the results from Triton without dynamic batching are always consistent. Any idea?
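For reference, a minimal sketch of the kind of `config.pbtxt` change described above; the model name, tensor names, shapes, and the specific batching values are placeholders, not taken from the original post:

```protobuf
# config.pbtxt -- illustrative only; names and shapes are assumptions
name: "my_model"            # hypothetical model name
platform: "tensorrt_plan"
max_batch_size: 8           # must match the engine's maximum batch dimension

input [
  { name: "input_0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output_0", data_type: TYPE_FP32, dims: [ 1000 ] }
]

# Enabling dynamic batching: Triton groups individual requests into
# larger batches, waiting up to the queue delay to form a preferred size.
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

A common cause of result changes under dynamic batching is a model (or TensorRT engine) that is not truly batch-agnostic, e.g. one whose computation mixes data across the batch dimension, so verifying the engine with varying batch sizes outside Triton is a useful first check.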


TensorRT Version: 7.1
GPU Type: 2080/1080/T4
Nvidia Driver Version: 450
CUDA Version: 11.0

Please try the latest Triton version. If you still face the issue, we recommend raising this query on Triton Inference Server · GitHub for better assistance.