I encountered an issue with the EfficientNMS plugin in its FP16 mode, and I'm wondering whether this should be treated as a bug or whether it is expected behavior that can be explained.
Running inference multiple times on the same image in FP16 mode very often produces slightly different output, one or a few pixels off. Example:
1st run [104 x 75 from (142, 156)] vs. 2nd run [105 x 73 from (141, 158)], with the same confidence score.
Also, the order of sorted bounding boxes with the same score sometimes differs from one iteration to the next.
I consider the detections valid, as they are very close. However, this is a problem for my testing plan: I assumed that reusing the same TensorRT plan file would always produce deterministic results, which had been the case until now.
Are there any ideas on how to explain this behavior?
If it's a bug, I'll try to prepare a reproduction using publicly available models/data.
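To make the symptom concrete, here is a minimal sketch of the kind of check I use: compare the raw output buffers of two consecutive runs bit-for-bit and report the maximum numeric difference. The helper name `compare_runs` is my own, and the sample values are the two boxes from the example above.

```python
import numpy as np

def compare_runs(buf_a: bytes, buf_b: bytes, dtype=np.float16):
    """Compare two raw output dumps from consecutive inference runs.

    Returns (bitwise_identical, max_abs_diff). A nonzero max_abs_diff for
    the same input image is the symptom described above.
    """
    a = np.frombuffer(buf_a, dtype=dtype).astype(np.float32)
    b = np.frombuffer(buf_b, dtype=dtype).astype(np.float32)
    bitwise = buf_a == buf_b
    max_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
    return bitwise, max_diff

# The two boxes from the report, as (x, y, w, h):
run1 = np.array([142, 156, 104, 75], dtype=np.float16).tobytes()
run2 = np.array([141, 158, 105, 73], dtype=np.float16).tobytes()
print(compare_runs(run1, run2))  # → (False, 2.0)
```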
- the issue occurs when reusing the same TRT plan
- with FP32, the issue does NOT occur
- with FP16 but the plugins around EfficientNMS forced to FP32, the issue does NOT occur
- I confirmed that the NMS plugin receives the same data in each iteration yet sometimes outputs slightly different results (I dumped and compared the raw bytes)
- not all images are "problematic", mostly those with many detections
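For reference, the FP32-forcing workaround from the list above can be expressed at build time with trtexec. This is a sketch, not my exact command: the model path, engine name, and the layer name passed to `--layerPrecisions` are placeholders (inspect your network to find the actual names), and the `--layerPrecisions`/`--precisionConstraints` flags are assumed to be available in the trtexec shipped with TRT 8.4.

```shell
# Rebuild the FP16 engine while pinning the NMS stage to FP32.
# "EfficientNMS_TRT" below is a placeholder layer name.
trtexec --onnx=model.onnx \
        --fp16 \
        --precisionConstraints=obey \
        --layerPrecisions="EfficientNMS_TRT":fp32 \
        --saveEngine=model_fp16_nms32.plan
```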
TensorRT Version: 8.4 (previous versions don't work because of another issue)
GPU Type: RTX2070
Nvidia Driver Version: 525.60.13
CUDA Version: 11.8
CUDNN Version: 8.6.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): baremetal