Run yolov3-tiny with a TensorRT model


What is the way to run yolov3-tiny optimized with TensorRT? I have converted the model to ONNX and then to TensorRT with help from this repo:

Now, what is the correct framework to run this model for video inference?

I know that DeepStream currently supports yolov3-tiny, but I want to be able to run the TensorRT model without DeepStream.



Have you tried our DeepStream SDK?
It contains samples for the YOLOv2, YOLOv2_tiny, YOLOv3 and YOLOv3_tiny models.


I tested yolov3-tiny with and without DeepStream and there is no difference: the number of frames per second is the same.
I suggest you use:

Inside, you can edit it for your purpose. I also suggest using this model:

I’m able to reach 17-18 fps.


Yes, I have tried DeepStream, but I am looking for a way to run a darknet yolo-tiny model accelerated with TensorRT in Python.

Hi simone.rinaldi,

I was not able to reach that many fps using darknet on a 416*416 yolo-tiny model; I had to lower the resolution to 256*256. How did it work for you?

Actually, I have tested DeepStream with a yolo-tiny 416*416 model and it ran at 29 fps. But I don't want to use it because I am facing some complications running DeepStream with the RTSP URLs of my cameras. Also, I need to use my tracking algorithm, which is written in Python.

Me too… my application is also written in Python and detects objects from IP cameras using RTSP.
About FPS, please take a look at my post:

The FPS shown in the Nvidia application relates only to network time, not to the complete application.

In my application, if I take an RTSP stream (1080p, 25 fps) and run detection on it, I’m able to reach a maximum of 11-13 fps. Consider, though, that a big part of the resources is used by OpenCV to draw boxes and show the image in a window, so my suggestion is to create a headless system.
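To make the difference between network-only FPS and whole-application FPS concrete, here is a minimal sketch of a per-stage timer. It is not taken from my application: `StageTimer` and the stage names are hypothetical, and the `time.sleep` calls only stand in for real capture, inference and drawing work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per pipeline stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> seconds spent
        self.frames = 0

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def frame_done(self):
        self.frames += 1

    def fps(self, *stages):
        """FPS implied by the named stages only; all stages if none are named."""
        names = stages or tuple(self.totals)
        elapsed = sum(self.totals[n] for n in names)
        return self.frames / elapsed if elapsed > 0 else float("inf")


timer = StageTimer()
for _ in range(5):
    with timer.stage("capture"):
        time.sleep(0.004)   # placeholder for reading a frame
    with timer.stage("inference"):
        time.sleep(0.004)   # placeholder for the network forward pass
    with timer.stage("draw"):
        time.sleep(0.004)   # placeholder for drawing boxes / showing a window
    timer.frame_done()

print("network-only fps:", round(timer.fps("inference"), 1))
print("end-to-end fps:  ", round(timer.fps(), 1))
```

Comparing `timer.fps("inference")` with `timer.fps()` shows how much of the frame budget goes to everything around the network, and timing the `draw` stage separately gives an idea of what a headless setup would save.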

Thanks for your reference!

What do you mean by network speed? I tested DeepStream with a 5-minute test video at 25 fps and it finished in 4 minutes and 5 seconds, which I thought confirmed the fps I see in the terminal. Anyway, I am currently looking at a TensorFlow implementation with TensorRT optimization. You can check the article here:

I have also tested in headless mode. With my yolo-tiny 256*256 model I got 17 fps including all overhead. I want to test with the 416 model, but I think I will get something around 12, whereas I need a minimum of 15 fps.
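For reference, the timing check on the 5-minute test video can be written out explicitly. This is only the arithmetic from the numbers above; the function name `effective_fps` is my own.

```python
def effective_fps(video_seconds, video_fps, wall_seconds):
    """Frames in the file divided by the wall-clock time taken to process them."""
    total_frames = video_seconds * video_fps
    return total_frames / wall_seconds


# 5 min of video at 25 fps = 7500 frames, processed in 4 min 5 s = 245 s
print(round(effective_fps(5 * 60, 25, 4 * 60 + 5), 1))  # 30.6
```

Because this divides by the total run time of the whole file, it is an end-to-end figure for that run rather than a network-only one.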

Oh! Very interesting, I will try it!
About the minimum FPS required, I have a suggestion for you: if you are not able to reach 15 fps, consider skipping the frames that you are not able to process.
In my Python application I calculate, each second, how many frames I’m able to process per second, and the application drops the excess frames in order to always stay synchronized with real-time events.
In this way my application automatically increases or decreases the number of frames processed, adapting itself to how I configure yolo.

So even if my video is 25 fps but my application can only process 12-13 fps, my object-detection application has no delay compared to real time.
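A minimal sketch of this kind of frame dropping is below. It is an illustration, not my actual code: the class `FrameDropper` and its methods are hypothetical names, and it recomputes the skip rate after roughly one second of accumulated processing time rather than on a wall-clock timer.

```python
import math


class FrameDropper:
    """Pick which incoming frames to process so detection keeps up with a live stream."""

    def __init__(self, source_fps, window=1.0):
        self.source_fps = float(source_fps)
        self.window = window   # seconds of processing time per measurement
        self.stride = 1        # process every `stride`-th frame
        self._index = 0        # frames received so far
        self._busy = 0.0       # processing seconds in the current window
        self._done = 0         # frames processed in the current window

    def should_process(self):
        """Call once per received frame; True means run detection on it."""
        keep = self._index % self.stride == 0
        self._index += 1
        return keep

    def record(self, seconds):
        """Report how long the last processed frame took, end to end."""
        self._busy += seconds
        self._done += 1
        if self._busy >= self.window:
            capacity = self._done / self._busy  # frames/s the pipeline can sustain
            self.stride = max(1, math.ceil(self.source_fps / capacity))
            self._busy = 0.0
            self._done = 0


# Simulated run: a 25 fps source, each processed frame costs 75 ms (about 13 fps).
dropper = FrameDropper(source_fps=25)
processed = 0
for _ in range(100):
    if dropper.should_process():
        dropper.record(0.075)
        processed += 1
print("stride:", dropper.stride, "processed:", processed)
```

In a real loop, `record()` would get the measured time for grabbing, detecting and post-processing one frame, and frames for which `should_process()` returns False would still be read from the stream and then discarded, so the RTSP buffer does not fall behind.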

I modified the TensorRT ‘yolov3_onnx’ sample and was getting ~14.2 FPS (yolov3-tiny-416) on a Jetson Nano. (The FPS measurement included image acquisition and all preprocessing/postprocessing.) Source code and a corresponding blog post have been shared online. I welcome feedback.