DeepStream Gaze App not functioning on 640x480 video

I am currently using DeepStream's gaze estimation application from deepstream_tao_apps.

I have an issue with small-format videos: when I run the app on a 640x480 video, the gaze direction estimate is wrong and unstable, oscillating a lot.

In order to use a small 640x480 video and run inference on it without rescaling, I modified lines 65 and 66 of deepstream_gaze_app.cpp in the above repo:

#define MUXER_OUTPUT_WIDTH 640
#define MUXER_OUTPUT_HEIGHT 480
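
For context, these defines are passed to the nvstreammux element, which scales every input frame to that resolution before inference. A minimal sketch of the usual DeepStream sample pattern (assuming the muxer element is named streammux; the exact call in deepstream_gaze_app.cpp may differ slightly):

g_object_set (G_OBJECT (streammux),
    "width", MUXER_OUTPUT_WIDTH,     /* frames are scaled to this size  */
    "height", MUXER_OUTPUT_HEIGHT,   /* before batching and inference   */
    "batch-size", 1,
    "batched-push-timeout", 40000, NULL);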

The problem does not appear when I run the app with a larger muxer resolution on the same 640x480 video:

#define MUXER_OUTPUT_WIDTH 1280
#define MUXER_OUTPUT_HEIGHT 720

I also modified lines 958 and 959 so that the video is displayed at its original resolution:

  g_object_set (G_OBJECT (nvtile), "rows", tiler_rows, "columns",
      tiler_columns, "width", 640, "height", 480, NULL);

Here is the script that I modified:
deepstream_gaze_app_modified.zip (10.6 KB)

I noticed that the issue is not video quality; it is related to the frame size: the larger the frame, the better the model performs. Even with the same 640x480 video, if we rescale it to 1280x960 (double), the model works much better.

There is also another potential problem: the facial landmarks are not shown on the rescaled video, while they are displayed on the original one.

In my application I need to use an image stream with a small resolution. Is there a way to make this work?

Here are some details about my setup:

• Hardware Platform: Jetson Orin NX
• DeepStream Version: deepstream-6.2
• JetPack Version: 5.1.1 - Jetson Linux 35.3.1
• TensorRT Version: tensorrt-8.5.2.2
• GStreamer Version: 1.16.3

This first example is with the 640x480 video without rescaling:
(video: 640x480 result)

This second example is with the same 640x480 video but rescaled to 1280x720:
(video: 1280x720 result)

1. I tested it with two videos, one original and the other scaled to 640x480 using ffmpeg -i input.mp4 -s 640x480 input2.mp4, and observed the same results.

2. I modified the streammux parameters in gazenet_app_config.yml to 1280x720, and at the same time set the nvmultistreamtiler parameters to 640x480 as you did (a code-level sketch follows after this list). The facial landmarks are displayed normally:

  g_object_set (G_OBJECT (nvtile), "rows", tiler_rows, "columns",
      tiler_columns, "width", 640, "height", 480, NULL);

3. I am using DS-7.0 and JP-6.0, and I run it as follows:

./deepstream-gaze-app ./gazenet_app_config.yml 
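
Regarding point 2 above: the code-level equivalent of that streammux change is to leave the defines in deepstream_gaze_app.cpp at the larger resolution (just a sketch; I applied the change through the YAML file):

#define MUXER_OUTPUT_WIDTH 1280   /* nvstreammux upscales the 640x480 source */
#define MUXER_OUTPUT_HEIGHT 720   /* to this resolution before inference */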

Could you try using DS-7.0 or share your test video?

Hello, thanks for your reply!
A couple of questions about what you did:

  1. When testing the two videos, did you set MUXER_OUTPUT_WIDTH and MUXER_OUTPUT_HEIGHT to 640 and 480 respectively in deepstream_gaze_app.cpp? If they were still set to 1280 and 720, the problem does not appear regardless of the video size, because the streammux scales the video back up to 1280x720 (the original values). So could you please confirm that these variables were changed to 640 and 480? Alternatively, you could simply replace the script with the one I attached, rebuild, and run.

  2. Did you use the 640x480 video to check if facial landmarks are displayed?

Thank you in advance!

It is the same as modifying the gazenet_app_config.yml configuration file.
In fact, I also tried modifying the code, and the result is similar.

Yes, they are displayed normally; the difference is that I use DS-7.0.

Okay, thank you!
I noticed one more thing: downward gaze estimation does not work very well.
To check whether this is related to the DeepStream version, would you mind running the model on the following video and sharing your results?
recording.zip (1.7 MB)

This is a 1280x720 video, run with the original script, nothing modified.
Here is my result:
(video: gazenet result)

I will try it.

Some other tips:

For this problem, you can try modifying the value of the circle radius:

disp_meta->circle_params[disp_meta->num_circles].radius = 1;
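
If a fixed value does not suit every resolution, a hypothetical variant is to scale the radius with the muxer width so the landmark circles stay visible at smaller output sizes (MUXER_OUTPUT_WIDTH is the define from deepstream_gaze_app.cpp, and MAX comes from GLib):

  /* Grow the circle radius with the muxer width so the landmark dots
   * remain visible at lower resolutions. */
  disp_meta->circle_params[disp_meta->num_circles].radius =
      MAX (1, MUXER_OUTPUT_WIDTH / 320);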

Although I can get it running fine on DS-7.0, it oscillates a lot too.

This is due to the accuracy of the model. To achieve higher accuracy, the model needs to be retrained.

Test 1: input video 640x480, nvstreammux 640x480, tiler 640x480

Test 2: input video 1280x720, nvstreammux 1280x720, tiler 1280x720

Alright, it seems we have similar results.

This implies that my DeepStream version is probably not what's causing the very poor results with small-resolution videos.

What do you believe to be the source of this problem?

We should be able to get the same results with 640x480 and 1280x720.

I think this is caused by the accuracy of the model. You need to retrain the model for your dataset.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
