We’re running a YOLOv5s model, converted to a TensorRT engine, on Triton Inference Server. Sending the same image to the server for inference returns different results each time. Is this normal, or is there something we can do to make it deterministic? We are using a Jetson Nano 2GB and built the TensorRT engine on the Jetson Nano itself.
TensorRT Version: 7.1
Operating System + Version: Jetson Nano 2GB / Tegra
Steps to reproduce: run inference on the same image, with the same TensorRT engine, on the same Triton Server multiple times; the results differ on each run.
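To quantify how large the run-to-run differences are (tiny floating-point jitter vs. genuinely different detections), here is a minimal sketch of a comparison helper. The helper name `max_diff_vs_first` is hypothetical, and in our setup the `outputs` list would be filled with the raw output tensors returned by the Triton client (e.g. `result.as_numpy(...)` from `tritonclient`); here synthetic arrays stand in for real responses.

```python
import numpy as np

def max_diff_vs_first(outputs):
    """Given output arrays from repeated inferences on the same input,
    return the largest absolute element-wise difference between any
    later run and the first run. 0.0 means bit-identical outputs."""
    baseline = np.asarray(outputs[0])
    return max(
        float(np.max(np.abs(np.asarray(o) - baseline)))
        for o in outputs[1:]
    )

# Synthetic example: two "runs" differing by a small float32 jitter.
run_a = np.array([0.10, 0.20, 0.30], dtype=np.float32)
run_b = run_a + np.float32(1e-6)
print(max_diff_vs_first([run_a, run_b]))
```

If the reported difference is at the level of float rounding (around 1e-6 for FP32), the variation is likely numerical noise from non-deterministic kernel/tactic behavior rather than a broken engine; differences large enough to change detection boxes or classes would point at a different problem.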