Yolov3's inference too heavy for Jetson Nano?

yasuda-toshihiro · December 16, 2019, 4:25am

I followed the article below on a Jetson Nano.
The console does indeed report 20 FPS when deepstream-app (yolov3) is run.

I have made the changes specified in the comment above in these config files, and we can achieve a throughput of 20 FPS.

However, I wondered why the video playback in this demo was not smooth.
Following the example of osd_sink_pad_buffer_probe, I enumerated the detected objects on every frame.

With interval = 5 set, I found that inference was executed only once every 5 frames.
Without the interval setting, every frame was inferred, but the throughput dropped to around 3 FPS.

You wrote “we can achieve a throughput of 20 FPS.”
That is true, but it does not mean that YOLOv3 inference itself runs at 20 FPS.

Is that right? Is my understanding correct?

The following comment in the article says “Its trade-off that needs to be tuned for your use-case.”
Is that what you mean?

So, is YOLOv3 inference simply too heavy for the Jetson Nano?
Is there any other way to speed up YOLOv3 inference on the Jetson Nano?
(I do not want to use the tiny model.)

Please make sure to change the height and width to 416 in yolov3.cfg before generating the engine file.
If the tracking results are bad for your test video, you can reduce the interval to improve the accuracy further, but the FPS will drop as well.
Its trade-off that needs to be tuned for your use-case.

AastaLLL · December 17, 2019, 2:37am

Hi,

YOLOv3 is a heavy model for a mobile or embedded system.
It requires around 140.69 billion FLOPs, while the tiny version requires only 5.56 billion, roughly 25 times fewer.

The interval parameter decides how often the detector is applied to the stream.
As you already know, setting interval=5 means inference runs once every 5 frames.
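
To make the arithmetic concrete, here is a rough sketch in Python. It assumes a simplified model in which the detector is the only bottleneck and the work done on skipped frames is free; the function names and the 30 FPS source rate are illustrative assumptions, not DeepStream APIs. The 3 FPS and 20 FPS figures are the ones reported earlier in this thread.

```python
def pipeline_fps(detector_fps, frames_per_inference, source_fps=30.0):
    """Upper bound on displayed FPS when the detector runs on one frame
    out of every `frames_per_inference` frames (simplified model)."""
    if detector_fps <= 0 or frames_per_inference < 1:
        raise ValueError("detector_fps must be > 0 and frames_per_inference >= 1")
    # The pipeline can never run faster than the source delivers frames.
    return min(source_fps, detector_fps * frames_per_inference)


def detector_rate(displayed_fps, frames_per_inference):
    """How many inferences per second actually happen at a given displayed FPS."""
    if frames_per_inference < 1:
        raise ValueError("frames_per_inference must be >= 1")
    return displayed_fps / frames_per_inference


# Detector on every frame: the pipeline is limited to the detector's own speed.
print(pipeline_fps(3.0, 1))
# Detector on one frame in five: the displayed rate can be several times higher.
print(pipeline_fps(3.0, 5))
# A displayed 20 FPS with inference on one frame in five means about
# 4 inferences per second, not 20.
print(detector_rate(20.0, 5))
```

In other words, the FPS printed on the console counts frames passing through the pipeline, not frames on which YOLOv3 actually ran.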

After that, our feature tracker assigns bounding boxes to the frames on which detection was skipped.
We provide several tracker algorithms, such as IOU, KLT, and DCF, which make the bbox assignment more accurate.
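
As a rough illustration of the simplest idea in that list, the sketch below associates boxes between two frames by intersection-over-union (IoU). It is not the DeepStream tracker implementation: the greedy matching, the (x1, y1, x2, y2) box format, the 0.3 threshold, and the function names are all assumptions made for this example. How a box is carried through skipped frames is tracker-specific and is not shown.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections, threshold=0.3):
    """Greedily pair detections with existing tracks, best overlap first.

    tracks:     dict mapping track id -> last known box
    detections: list of boxes from the current inferred frame
    Returns a dict mapping detection index -> track id.
    """
    pairs = [
        (iou(box, det), track_id, det_idx)
        for track_id, box in tracks.items()
        for det_idx, det in enumerate(detections)
    ]
    pairs.sort(key=lambda p: p[0], reverse=True)

    assigned, used_tracks = {}, set()
    for score, track_id, det_idx in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same object
        if track_id in used_tracks or det_idx in assigned:
            continue
        assigned[det_idx] = track_id
        used_tracks.add(track_id)
    return assigned


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
print(match_tracks(tracks, detections))
```

Overlap-based matching of this kind works best when objects move only a little between inferred frames, which is why a larger interval tends to suit slow-moving scenes.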

This is a trade-off.
We know that some complicated models cannot reach real-time performance on an embedded system.
That’s why we developed several tracking algorithms, so that such models can still be used in practice.

How often the detector should be applied depends on how fast the objects in your scenario move.
For example, you can set a larger interval value for a video conference use case.

Our suggestion is to try the different tracking algorithms to see whether one of them meets your requirements.
Another possible improvement is to set the inference mode to fp16.
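
In the nvinfer configuration file, precision is selected with the network-mode key in the [property] group (0 = FP32, 1 = INT8, 2 = FP16). The sketch below uses Python's configparser to flip that key; the SAMPLE text is a made-up minimal fragment rather than the shipped YOLOv3 config, and set_property is a hypothetical helper written for this example.

```python
import configparser
import io

# Made-up minimal fragment for illustration; a real config has many more keys.
SAMPLE = """\
[property]
network-mode=0
interval=0
"""


def set_property(config_text, key, value, section="property"):
    """Return config_text with `key` in `section` set to `value`."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(config_text)
    parser.set(section, key, str(value))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# 2 selects FP16 precision.
updated = set_property(SAMPLE, "network-mode", 2)
print(updated)
```

Note that configparser drops comments when it rewrites a file, so editing the key by hand is often simpler. A TensorRT engine is built for a specific precision, so the engine file most likely needs to be regenerated after this change.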

Thanks.

yasuda-toshihiro · December 17, 2019, 6:02am

Thank you, AastaLLL. I understand the issue now.
