TLT: different inference results

I just trained a FasterRCNN (backbone: efficientnet_b1) tlt model on the public KITTI dataset.
I trained for only 3 epochs, exported the tlt model to an etlt model, and then generated the trt_fp16 engine.
But I cannot see much difference between inference with the tlt model and inference with the trt_fp16 engine.
Can you double-check on your side, or try the KITTI dataset?
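
If you want to put a number on the difference instead of comparing images by eye, here is a minimal sketch. It assumes both inference runs dumped their detections as KITTI-format label files (class name in the first field, `x1 y1 x2 y2` in fields 5 to 8). The helper names `parse_kitti_line`, `iou` and `count_matches` are made up for this example and are not part of TLT.

```python
def parse_kitti_line(line):
    """Parse one KITTI label line into (class_name, (x1, y1, x2, y2))."""
    fields = line.split()
    return fields[0], tuple(float(v) for v in fields[4:8])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_matches(dets_a, dets_b, iou_threshold=0.5):
    """Greedily count boxes in dets_a with a same-class match in dets_b."""
    unused = list(dets_b)
    matched = 0
    for cls_a, box_a in dets_a:
        best_idx, best_iou = None, iou_threshold
        for idx, (cls_b, box_b) in enumerate(unused):
            if cls_a != cls_b:
                continue
            overlap = iou(box_a, box_b)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            unused.pop(best_idx)  # each box in dets_b is matched at most once
            matched += 1
    return matched


# Hypothetical label lines for one image, made up for illustration only.
tlt_lines = [
    "Car 0.00 0 0.00 100.00 120.00 200.00 220.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.95",
    "Pedestrian 0.00 0 0.00 300.00 100.00 340.00 200.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.80",
]
trt_lines = [
    "Car 0.00 0 0.00 101.00 121.00 199.00 219.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.94",
]

tlt_dets = [parse_kitti_line(l) for l in tlt_lines]
trt_dets = [parse_kitti_line(l) for l in trt_lines]
matched = count_matches(tlt_dets, trt_dets)
print(f"{matched} of {len(tlt_dets)} tlt boxes have a match in the trt_fp16 output")
```

Running this per image over the two label folders should show whether the engine is really missing boxes or whether the boxes are only shifted slightly.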
Also, you can try setting a lower bbox_visualize_threshold when you run inference with the trt_fp16 engine.
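
To show why the threshold matters, here is a small sketch. It is not the TLT visualization code, only an illustration of the idea that boxes scoring below the threshold are not drawn. Since FP16 scores may differ slightly from the tlt model's scores, a box close to the threshold can be visible in one run and hidden in the other. The function `visible_boxes` and the scores are hypothetical.

```python
def visible_boxes(detections, threshold):
    """Keep only the (class_name, score) detections at or above threshold."""
    return [d for d in detections if d[1] >= threshold]


# Hypothetical detections for one image, made up for illustration only.
dets = [("Car", 0.92), ("Car", 0.61), ("Pedestrian", 0.58), ("Cyclist", 0.31)]

for threshold in (0.6, 0.3):
    shown = visible_boxes(dets, threshold)
    print(f"threshold={threshold}: {len(shown)} boxes drawn")
```

With these made-up scores, the Pedestrian box at 0.58 is hidden at a threshold of 0.6 and drawn at 0.3, so lowering the threshold is a quick way to check whether the detections are still there with lower confidence.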