Issue: accuracy is very poor after converting the model to a TensorRT engine.

If FP16 accuracy is poorer than FP32, please use TensorRT 8.6.1; the FP16 issue is gone there.
Steps to use TensorRT 8.6.1 inside the 5.0.0 PyTorch docker (adjust the paths to match your own workspace):

 $ wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
 $ tar zxvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
 $ pip install TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp38-none-linux_x86_64.whl
 $ export LD_LIBRARY_PATH=/home/morganh/demo_3.0/public_data/notebook/pointpillars_20230816/TensorRT-8.6.1.6/lib:$LD_LIBRARY_PATH
 $ pointpillars export -e pointpillars.yaml -k nvidia_tlt --save_engine fp16_trt8.6.engine -t fp16
 $ pointpillars inference -e pointpillars.yaml -r result_infer_export_engine_8.6.1 -k nvidia_tlt --trt_engine fp16_trt8.6.engine
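Before running the export, it can be worth confirming that the pip-installed 8.6.1 Python binding is actually the one being imported (a stale system install can shadow it). A minimal sanity-check sketch; the `tensorrt` import assumes the wheel above installed cleanly, and `is_expected_trt` is a hypothetical helper, not part of any NVIDIA API:

```python
# Sanity check: does the active tensorrt binding match the 8.6.1 tarball?
def is_expected_trt(version, expected="8.6.1"):
    """Return True when the installed version string starts with `expected`."""
    return version.startswith(expected)

try:
    import tensorrt
    status = "OK" if is_expected_trt(tensorrt.__version__) else "MISMATCH"
    print("tensorrt", tensorrt.__version__, status)
except ImportError:
    print("tensorrt is not importable; re-run the pip install step above")
```

If this reports a mismatch, check that no other `tensorrt` wheel is installed and that `LD_LIBRARY_PATH` points at the extracted `TensorRT-8.6.1.6/lib` directory before invoking `pointpillars export`.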