I’m using the sample code that converts YOLOv3 for use with TensorRT. The sample documentation can be found at https://docs.nvidia.com/deeplearning/sdk/tensorrt-sample-support-guide/index.html#yolov3_onnx
I realised that the post-processing step is extremely slow (up to a second per frame). The cause of the slowdown is this block:
# E.g. in YOLOv3-608, there are three output tensors, which we associate with their
# respective masks. Then we iterate through all output-mask pairs and generate candidates
# for bounding boxes, their corresponding category predictions and their confidences:
boxes, categories, confidences = list(), list(), list()
for output, mask in zip(outputs_reshaped, self.masks):
    box, category, confidence = self._process_feats(output, mask)
    box, category, confidence = self._filter_boxes(box, category, confidence)
    boxes.append(box)
    categories.append(category)
    confidences.append(confidence)
which can be found in the sample's data_processing.py (PostprocessYOLO._process_yolo_output).
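For reference, this is roughly how I'm timing the post-processing in isolation, feeding random arrays shaped like the three YOLOv3-608 output tensors so no engine is involved. It's only a minimal sketch: the PostprocessYOLO arguments, the output shapes and the (1920, 1080) image size below follow what I have from the sample's onnx_to_tensorrt.py, so treat the exact names and values as assumptions.

import time

import numpy as np

from data_processing import PostprocessYOLO  # ships with the yolov3_onnx sample

# Post-processor configured as in the sample's onnx_to_tensorrt.py (values assumed)
postprocessor = PostprocessYOLO(
    yolo_masks=[(6, 7, 8), (3, 4, 5), (0, 1, 2)],
    yolo_anchors=[(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                  (59, 119), (116, 90), (156, 198), (373, 326)],
    obj_threshold=0.6,
    nms_threshold=0.5,
    yolo_input_resolution=(608, 608))

# Random tensors with the same NCHW shapes as the three YOLOv3-608 output layers,
# so only the post-processing cost is measured
dummy_outputs = [np.random.rand(1, 255, g, g).astype(np.float32) for g in (19, 38, 76)]

start = time.perf_counter()
# (1920, 1080) stands in for the original image's (width, height)
boxes, classes, scores = postprocessor.process(dummy_outputs, (1920, 1080))
print('post-processing took %.3f s' % (time.perf_counter() - start))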
Any suggestions on how to improve the post-processing speed?