mAP loss in FP16 mode

Windows 10
I see a drop in mAP in FP16 mode, compared with FP32, after converting a pruned YOLOv3 ONNX model to a TensorRT engine.
I convert the ONNX model to FP16 with the following C++ code.

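// Create the builder, an empty network, and an ONNX parser bound to that network.
nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
nvinfer1::INetworkDefinition* network = builder->createNetwork();
auto parser = nvonnxparser::createParser(*network, gLogger);
// (The call that parses the ONNX file into the network is not shown in this post.)
builder->setMaxWorkspaceSize(1 << 20);  // 1 MiB of scratch memory for the builder
builder->setFp16Mode(true);             // the only FP16-related setting I use
nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);
trtModelStream = engine->serialize();   // serialized engine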

It works well in FP32. However, after converting to FP16, some objects in the bottom half of the picture are not detected, while detection in the top half still works well. It looks as if the model had converged for the top half of the picture but not for the bottom half. I am also confused that I get FP16 engines with different inference results when I convert the same FP32 ONNX model several times. Is there any stochastic process in the TensorRT conversion? Or is there anything else I should set for FP16 besides calling builder->setFp16Mode(true)?
I have also experimented a lot with INT8 conversion. On the RTX 20 series it gives little FPS advantage over FP16, but a much larger mAP loss.

Can someone help me?

Can you provide the following information so we can better help?
Provide details on the platforms you are using:
o Linux distro and version
o GPU type
o NVIDIA driver version
o CUDA version
o cuDNN version
o Python version [if using Python]
o TensorFlow version
o TensorRT version
o If Jetson, OS and hardware versions

Also, if possible please share the script and model file to reproduce the issue.
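
When reporting the differences between runs, it may also help to put numbers on them. Below is a minimal sketch, assuming the output tensors of two engines (for example the FP32 engine and an FP16 engine, or two FP16 engines built from the same ONNX file) have already been copied back to the host as flat float vectors for the same input image. The names DiffStats and compareOutputs are hypothetical helpers for illustration, not part of TensorRT.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of TensorRT: summarizes how far two
// host-side output buffers differ, element by element.
struct DiffStats {
    float maxAbs;        // largest absolute difference
    std::size_t argMax;  // index at which the largest difference occurs
    double meanAbs;      // mean absolute difference over all elements
};

DiffStats compareOutputs(const std::vector<float>& ref,
                         const std::vector<float>& test) {
    if (ref.empty() || ref.size() != test.size()) {
        throw std::invalid_argument("buffers must be non-empty and equal in size");
    }
    DiffStats s{0.0f, 0, 0.0};
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float d = std::fabs(ref[i] - test[i]);
        if (d > s.maxAbs) {
            s.maxAbs = d;
            s.argMax = i;
        }
        s.meanAbs += d;
    }
    s.meanAbs /= static_cast<double>(ref.size());
    return s;
}
```

Running this once for FP32 versus FP16, and once for two separately built FP16 engines, would show whether the build-to-build variation is comparable in size to the FP16 error itself. The argMax index can be mapped back to an output grid cell to check whether the largest deviations really fall in the bottom half of the image.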