I am getting 6 FPS (at 608x608) with darknet on a Jetson Nano. After converting the model to TensorRT, inference takes 800 ms and post-processing (the YOLO layer plus NMS) takes 110 ms per 608x608 frame, so I am getting roughly the same performance in both cases. Can the post-processing time be reduced so that overall performance improves? I am using the sample TensorRT code (YOLO to ONNX, then ONNX to TensorRT).
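For context on where that 110 ms goes: in the TensorRT YOLO sample, the YOLO decode and NMS run on the CPU in NumPy, so the NMS loop is a common target for optimization. This is not the sample's actual code, just a minimal sketch of the standard vectorized NMS approach, assuming boxes come in as an (N, 4) array of [x1, y1, x2, y2] corners with a matching score array:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS, vectorized over the candidate boxes.

    boxes  -- (N, 4) float array of [x1, y1, x2, y2] corners
    scores -- (N,) float array of confidences
    Returns the indices of the boxes to keep.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IoU of the top box against all remaining boxes at once
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # drop every box that overlaps the kept box too much
        order = order[1:][iou <= iou_threshold]
    return keep


boxes = np.array([[0, 0, 10, 10],
                  [1, 1, 11, 11],     # heavily overlaps the first box
                  [20, 20, 30, 30]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores))  # -> [0, 2]
```

Per-class filtering by confidence threshold before calling NMS (so only a few hundred boxes reach the loop) usually matters more than the NMS itself; the other option people use on Jetson is moving the decode/NMS into the engine with a TensorRT plugin so it runs on the GPU.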