TensorRT EfficientNMS plugin FP16 inconsistent (but valid) results



I encountered an issue with the EfficientNMS plugin in FP16 mode, and I’m wondering whether it should be treated as a bug or whether it is expected and can be explained.

Running inference multiple times on the same image in FP16 mode very often produces slightly different output, off by one or a few pixels. Example:
1st run [104 x 75 from (142, 156)] vs 2nd run [105 x 73 from (141, 158)], with the same confidence score.
Also, the order of bounding boxes that share the same score sometimes differs from the previous iteration.

I consider the detections valid, as they are very close. However, this is a problem for my test plan: I assumed that re-using the same TensorRT plan file would always give deterministic results, which has been the case until now.
Are there any ideas on how to explain this behavior?
If it’s a bug, I’ll try to prepare a reproduction using publicly available models/data.
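For the test plan, a comparison that tolerates both effects (boxes a pixel or two off, and reordering among equal scores) could look like the sketch below. It is only an illustration: the `Det` record, the `detections_match` helper, the tolerances and the score value 0.9 are all made up for this example and do not come from TensorRT or from my project. The box values are the ones from the example above.

```python
from collections import namedtuple

# Hypothetical detection record: top-left corner, size, confidence score.
Det = namedtuple("Det", "x y w h score")

def detections_match(run_a, run_b, pixel_tol=2, score_tol=1e-3):
    """Order-insensitive comparison of two detection lists.

    Each detection in run_a is greedily paired with the first unused
    detection in run_b whose box and score lie within the tolerances.
    Greedy pairing is an assumption that holds for well-separated boxes.
    """
    if len(run_a) != len(run_b):
        return False
    unmatched = list(run_b)
    for a in run_a:
        for i, b in enumerate(unmatched):
            box_close = all(abs(p - q) <= pixel_tol for p, q in zip(a[:4], b[:4]))
            if box_close and abs(a.score - b.score) <= score_tol:
                del unmatched[i]  # each detection in run_b is used at most once
                break
        else:
            return False  # nothing in run_b is close enough to `a`
    return True

# Boxes from the example above, plus a second box with the same (made-up)
# score that appears in a different position in the second run.
first = [Det(142, 156, 104, 75, 0.9), Det(10, 10, 20, 20, 0.9)]
second = [Det(10, 10, 20, 20, 0.9), Det(141, 158, 105, 73, 0.9)]

print(detections_match(first, second))               # True
print(detections_match(first, second, pixel_tol=0))  # False
```

This only relaxes the test; it does not explain why the plugin output changes between runs.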


  • the issue occurs when re-using the same TRT plan
  • with FP32, the issue does NOT occur
  • with FP16 and the plugins around EfficientNMS forced to FP32, the issue does NOT occur
  • confirmed that the NMS plugin receives the same input data in each iteration and sometimes produces slightly different output (dumped and compared raw bytes)
  • not all images are “problematic”, mostly those with many detections
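The raw-byte comparison mentioned above can be sketched roughly as follows. This is not the exact tool I used: `diff_fp16_buffers` is a hypothetical helper, and it assumes the dumped buffers hold little-endian IEEE 754 half-precision values (the `e` format of Python's `struct` module).

```python
import struct

def diff_fp16_buffers(buf_a: bytes, buf_b: bytes):
    """Return (index, value_a, value_b) for every FP16 element whose bytes differ."""
    if len(buf_a) != len(buf_b) or len(buf_a) % 2:
        raise ValueError("buffers must have the same, even length")
    diffs = []
    for i in range(len(buf_a) // 2):
        raw_a = buf_a[2 * i:2 * i + 2]
        raw_b = buf_b[2 * i:2 * i + 2]
        if raw_a != raw_b:  # compare bytes, so NaN and signed zero are handled too
            value_a = struct.unpack("<e", raw_a)[0]
            value_b = struct.unpack("<e", raw_b)[0]
            diffs.append((i, value_a, value_b))
    return diffs

# Two synthetic 4-element dumps that differ only in the third element.
run1 = struct.pack("<4e", 142.0, 156.0, 104.0, 75.0)
run2 = struct.pack("<4e", 142.0, 156.0, 105.0, 75.0)

print(diff_fp16_buffers(run1, run1))  # []
print(diff_fp16_buffers(run1, run2))  # [(2, 104.0, 105.0)]
```

Applying the same check to the plugin's input buffers is how one could confirm that the inputs are byte-identical while the outputs are not.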

Thank you!


TensorRT Version: 8.4 (previous versions don’t work because of another issue)
GPU Type: RTX 2070
Nvidia Driver Version: 525.60.13
CUDA Version: 11.8
CUDNN Version: 8.6.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): baremetal

Relevant Files

Steps To Reproduce


We are checking on this issue internally.
Could you please share a minimal repro model/script with us for better debugging?

Thank you.

Thank you for the reply.
I will try to create a minimal setup to reproduce this issue based on publicly available data. Please allow me some time, as I cannot share our project’s code/models here.



I created a small app to show the problem. What it does:

  • creates EfficientNMS plugin
  • configures the plugin based on the provided sample input
  • runs the plugin’s enqueue() once on the sample data and saves the output for later comparison
  • runs enqueue() 10 more times and compares each output with the first run.
    If an output differs from the 1st run, it dumps the buffer contents to a file.

In my case, I get different results for the boxes output on almost every run.
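In outline, the comparison loop works like the sketch below. This is not the code from the attached archive: `run_once` is a hypothetical callable standing in for "call the plugin's enqueue() on the sample data and copy the output buffer back as bytes", and `fake_enqueue` only simulates an output that changes on some runs.

```python
import itertools

def check_determinism(run_once, repeats=10, dump_prefix=None):
    """Run once for a reference, then `repeats` more times.

    Returns the 1-based numbers of the runs whose output bytes differ from
    the reference. If dump_prefix is set, each differing output is written
    to "<dump_prefix>_run<N>.bin".
    """
    reference = run_once()
    mismatches = []
    for n in range(1, repeats + 1):
        out = run_once()
        if out != reference:
            mismatches.append(n)
            if dump_prefix is not None:
                with open(f"{dump_prefix}_run{n}.bin", "wb") as f:
                    f.write(out)
    return mismatches

# Simulated plugin: every third call returns a slightly different buffer.
_outputs = itertools.cycle([b"\x00\x01", b"\x00\x01", b"\x00\x02"])

def fake_enqueue():
    return next(_outputs)

print(check_determinism(fake_enqueue))  # [2, 5, 8]
```

With a deterministic plugin the returned list would be empty.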


repro_fp16_nms_issue.tar.gz (613.8 KB)

@spolisetty Hi! Did you have a chance to check the issue internally? Thanks