Hello everyone.
For some time now I have been benchmarking the performance of several models on an object detection task.
I tested the models on videos of different quality and different frame rates. What I observe is that large models reach the same FPS on all videos. On the other hand, if I use a small model, I get a lower FPS on low-quality videos and a higher FPS on high-quality videos.
Why does this happen?
Note that I use TensorRT as the optimizer.
Thank you.
Hi,
Would you mind sharing more details about your comparison?
Do you use trtexec?
Which model, precision and data format do you use?
It will be good if you can share a sample and model to reproduce the issue directly.
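For reference, a typical trtexec benchmark run looks something like the sketch below (the file names are placeholders, not your actual models):

```shell
# Benchmark an ONNX model with FP16 enabled (model.onnx is a placeholder)
trtexec --onnx=model.onnx --fp16

# Or benchmark an already-serialized engine (model.engine is a placeholder)
trtexec --loadEngine=model.engine
```

trtexec reports per-inference latency and throughput for the network alone, independent of any video decoding or rendering, which makes it useful for isolating where the time goes.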
Thanks.
Thanks for the reply.
For my tests I use the SSD-Mobilenet-V1 model, which is already included in the repository and used through detectNet. The overall size of this model is around 30.7 MB.
Using the GetNetworkFPS() function on several videos, I measure a certain inference rate.
Now, if I replace the model with a simplified version of SSD-Mobilenet-V1, about 3.5 MB in size, the FPS is obviously higher than before. What I can't understand is why I get more FPS on high-quality videos than on low-quality ones.
Note that TensorRT is used for both models (I work with serialized ".engine" files).
I am attaching an image in which the comparison is shown. The red bars indicate the FPS number of the larger model, while the green bars indicate the number of FPS obtained on the simplified model.
Videos (V) and webcams (W) have the following resolutions:
-V1: 240p_60fps.mp4,
-V2: 360p_30fps.mp4,
-V3: 480p_30fps.mp4,
-V4: 720p_30fps.mp4,
-V5: 1080p_30fps.mp4,
-V6: 1080p_60fps.mp4,
-W1: 720p_60fps,
-W2: 1080p_30fps
As for precision, I think it’s FP16.
I hope I have been detailed enough.
Thank you.
Hi,
In general, the network input size is fixed at training time (e.g. 300x300 for SSD-Mobilenet-V1).
Whatever the input resolution, pre-processing first rescales the frame to the network input size before inference.
So the performance of a model mainly depends on the network size.
Based on your image, we would expect performance like the red bars (large model) on all videos.
Do you find any difference in the detected objects among the videos?
It’s possible that the performance is affected by the rendering steps.
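One way to check whether rendering (or any other stage) is responsible is to time each stage of the pipeline separately. A minimal sketch with stand-in stage functions (not the actual detectNet code) could look like this:

```python
import time

def timed(fn, *args):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Stand-in stages; in a real pipeline these would be the
# pre-processing, TensorRT inference, and rendering steps.
def preprocess(frame): return frame
def infer(tensor): return []
def render(detections): pass

frame = object()
totals = {"pre": 0.0, "infer": 0.0, "render": 0.0}
n_frames = 100
for _ in range(n_frames):
    tensor, dt = timed(preprocess, frame); totals["pre"] += dt
    dets, dt = timed(infer, tensor); totals["infer"] += dt
    _, dt = timed(render, dets); totals["render"] += dt

for stage, t in totals.items():
    print(f"{stage}: {1000 * t / n_frames:.3f} ms/frame")
```

If the per-frame rendering time grows with the video resolution while the inference time stays flat, that would explain why the small (fast) model is more sensitive to the video than the large one.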
Thanks.
Thanks for the reply.
The results I showed you are obtained without using the following commands:
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
After executing these commands, the results are totally different: on every video the large model's FPS increases by about 280% compared to before.
How come I get similar results on videos of different quality?
Thank you.
Hi,
Thanks for the update.
That's because the network input size is fixed (e.g. 300x300).
So for the different input videos, the first step is to downscale the resolution to the network input size
(e.g. 1080p → 300x300, 720p → 300x300).
Since the main computational cost comes from the inference, the execution time will be very similar for the same network input resolution.
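This effect can be illustrated with a toy per-frame cost model: total time = rescale cost (which scales with the source pixel count) + a fixed inference cost set by the network size. The numbers below are made up for illustration, not measurements:

```python
# Toy cost model (illustrative numbers, not measurements):
# per-frame time = rescale cost (scales with source pixels) + fixed inference cost.
RESCALE_NS_PER_PIXEL = 2  # hypothetical rescale cost per source pixel

def fps(width, height, inference_ms):
    rescale_ms = width * height * RESCALE_NS_PER_PIXEL / 1e6
    return 1000.0 / (rescale_ms + inference_ms)

for w, h in [(426, 240), (1280, 720), (1920, 1080)]:
    print(f"{h}p  large: {fps(w, h, 25.0):6.1f} FPS   small: {fps(w, h, 2.0):6.1f} FPS")
```

With a large (slow) model the fixed inference cost dominates, so FPS barely changes across resolutions; with a small (fast) model the resolution-dependent rescale cost becomes a significant fraction of the frame time, so FPS varies much more between videos.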
Thanks.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.