Any YOLOv3 Python sample running on TensorRT faster than 40 ms per frame?

I have tried implementing the yolov3_onnx sample and got it running at 42 ms per frame, which is about as fast as YOLOv3 without TensorRT. Would someone be able to point me to a TensorRT implementation of YOLOv3 in Python that is much faster?

Perhaps some of the replies from would help

Thanks HengChenKim! Actually, I am part of that discussion as well, and I believe we both concluded that even after speeding up the bounding box handling, the inference time is still 40 ms per frame. I tried using the Alexey repo and managed to get my YOLOv3 running at 23 ms per frame. I was looking for someone within the NVIDIA community who has reached similar speeds using the sample code.

LOL. My bad. Well, after I changed the sigmoid function, I ran it at INT8 and managed to get faster inference times without much compromise in accuracy, so perhaps you can try that. Still not as fast as 23 ms per frame, though.

How did you get it to run in 40ms? That would already be a huuuuge improvement for me.

I get between 1700 and 2000+ ms per inference with the sample at /usr/src/tensorrt/samples/python/yolov3_onnx

Are you running inference on CPU or GPU? You might have missed some steps in the setup, as it should not be as high as 1000 ms if you are running on a modern GPU.

I followed every step in the readme of the sample.

TensorRT came pre-installed with the latest JetPack, fresh install.
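
To compare figures like the 42 ms and 1700+ ms reported above on equal footing, it may help to time only the steady-state inference calls and skip the first few. The following is a minimal sketch, not code from the yolov3_onnx sample: `time_per_frame` and `fake_infer` are hypothetical names, and `fake_infer` is a stand-in that should be replaced with whatever call runs one frame through the engine. It assumes that call blocks until the result is ready, and that early iterations may carry one-time setup costs.

```python
import statistics
import time


def time_per_frame(infer, frames, warmup=5):
    """Return (mean_ms, median_ms) for infer(frame), skipping warm-up calls."""
    frames = list(frames)
    if len(frames) <= warmup:
        raise ValueError("need more frames than warm-up iterations")

    # Warm-up: early calls may include one-time setup costs (assumption),
    # so they are run but not timed.
    for frame in frames[:warmup]:
        infer(frame)

    timings_ms = []
    for frame in frames[warmup:]:
        start = time.perf_counter()
        infer(frame)  # assumed to block until the output is available
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return statistics.mean(timings_ms), statistics.median(timings_ms)


def fake_infer(frame):
    # Hypothetical stand-in for the real inference call; sleeps about 2 ms.
    time.sleep(0.002)
    return frame


if __name__ == "__main__":
    mean_ms, median_ms = time_per_frame(fake_infer, range(25))
    print(f"mean {mean_ms:.1f} ms, median {median_ms:.1f} ms per frame")
```

Reporting the median alongside the mean makes an occasional slow frame easier to spot. If the timed number still sits in the seconds range after warm-up, the time is probably being spent somewhere other than first-call setup.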