I’m using TensorRT 7.1.3.4 and I’ve observed the same issue with darknet (YOLOv4) → ONNX → TensorRT. In the SPP module, 4 tensors from previous layers are concat’ed together. The incorrect computation of INT8 “concat” results in very bad detection outputs.
If I use the same code to convert a YOLOv3 model to TensorRT INT8, the result is good: the mAP of the INT8 engine differs from the FP16 and FP32 engines by less than 1%. YOLOv3 does not have the SPP (concat) module.
The input tensors to the “concat” usually have different dynamic ranges, so their raw INT8 codes cannot be concatenated directly; each input would first need to be requantized to a common scale. I suspect the current TensorRT release does not handle that correctly.
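To illustrate the point, here is a minimal NumPy sketch (with made-up values and a simple symmetric per-tensor quantization scheme, not TensorRT's actual implementation) showing why INT8 codes produced with different scales cannot be concatenated as-is:

```python
import numpy as np

def quantize(x, dyn_range):
    # Symmetric INT8 quantization: map [-dyn_range, dyn_range] to [-127, 127].
    scale = dyn_range / 127.0
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8), scale

# Two hypothetical concat inputs with very different dynamic ranges.
a = np.array([0.5, -1.0, 2.0], dtype=np.float32)
b = np.array([50.0, -100.0, 25.0], dtype=np.float32)

qa, sa = quantize(a, 2.0)     # small dynamic range -> fine-grained scale
qb, sb = quantize(b, 100.0)   # large dynamic range -> coarse scale

# WRONG: concatenate the raw INT8 codes and dequantize with one scale.
# The codes from `a` are interpreted with the wrong scale, so its
# values come out wildly off.
naive = np.concatenate([qa, qb]).astype(np.float32) * sb

# RIGHT: requantize both inputs to a shared scale before concatenating.
shared = max(2.0, 100.0) / 127.0
ra = np.clip(np.round(qa * sa / shared), -127, 127).astype(np.int8)
rb = np.clip(np.round(qb * sb / shared), -127, 127).astype(np.int8)
correct = np.concatenate([ra, rb]).astype(np.float32) * shared

print(np.abs(naive[:3] - a).max())                      # large error
print(np.abs(correct - np.concatenate([a, b])).max())   # only quantization error
```

Note that the "right" path still loses precision on the small-range input (its values are now represented with the coarse shared scale), which is why per-tensor requantization before concat matters so much when the ranges differ by orders of magnitude.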
@AakankshaS, could you help to check again? Thanks.