INT8 throughput and latency are worse than FP16 for the MiDaS DPT-Hybrid model on Thor


• Hardware Platform (Jetson / GPU) Jetson AGX Thor
• JetPack Version (valid for Jetson only) JP7.0
• TensorRT Version 10.13

I am trying to estimate how the MiDaS v3 DPT-Hybrid model would perform on Jetson AGX Thor. When I profile it with trtexec, I see the following:

  • MiDaS v3 FP16: 173 FPS, mean latency: 5.8 ms
  • MiDaS v3 INT8: 97 FPS, mean latency: 10.3 ms
  • MiDaS v3 Best: 124.62 FPS, mean latency: 8.09 ms (generated with the --best flag)
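
To put the gap in perspective, here is a rough sketch that expresses each run relative to FP16. The numbers are the ones listed above; the `results` dictionary and the `relative_to_fp16` helper are just illustrative names, not part of trtexec.

```python
# Throughput (FPS) and mean latency (ms) copied from the trtexec runs above.
results = {
    "FP16": {"fps": 173.0, "latency_ms": 5.8},
    "INT8": {"fps": 97.0, "latency_ms": 10.3},
    "Best": {"fps": 124.62, "latency_ms": 8.09},
}

baseline = results["FP16"]


def relative_to_fp16(name):
    """Return (throughput ratio, latency ratio) of a run versus FP16."""
    run = results[name]
    fps_ratio = run["fps"] / baseline["fps"]
    latency_ratio = run["latency_ms"] / baseline["latency_ms"]
    return fps_ratio, latency_ratio


for name in results:
    fps_ratio, latency_ratio = relative_to_fp16(name)
    print(f"{name}: {fps_ratio:.2f}x FP16 throughput, "
          f"{latency_ratio:.2f}x FP16 latency")
```

By this measure the INT8 engine reaches about 0.56x the FP16 throughput at about 1.78x the latency, and the --best engine about 0.72x the throughput at about 1.39x the latency.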

I am doing PTQ with trtexec. I do not have a calibration set, so trtexec does it with random inputs. Why am I seeing poorer performance with INT8 than with FP16? And why does the engine generated with the --best flag not match the performance of FP16?

Thank you for your time in advance.