How to visualise the 3D gaze vector output of the GazeNet model?

Hi Morganh,

Thanks for your reply.
I have some comments on each approach as below:

  1. Use “gazenet inference”.
    When looking at the inference-set/json_datafactory_v2/p01_day03.json file, I see that the annotations are different for each file, i.e. for each image file we need to know the locations of faces, landmarks, etc. So I don’t think I can reuse them for a new image.

  2. Use deepstream-gaze-app
    The app can generate 80 facial landmarks, but the gaze model requires 104 landmarks. So I guess they are not compatible.

  3. Use an old inference pipeline
    I am not clear on how to proceed with this.

Did any approach above work for you?
Finally, my main goal is to visualise the gaze within the deepstream-gaze-app. Do you have any suggestion to do that directly in the deepstream-gaze-app?


Actually, approach 3 was verified previously. This example video from NVIDIA was produced with it.
For approach 2, I will check if it works after reducing the 104 landmarks to 80.

Just to share a workaround for approach 2, which runs inference with the deepstream-gaze-app. Please follow the README in the attachment.
It will dump the 80 facial landmark points in order to visualize the cropped face along with the gaze vector. (372.6 KB)

Main change:

cp deepstream_gaze_app.cpp  bak_deepstream_gaze_app.cpp
cp deepstream_gaze_app_update.cpp  deepstream_gaze_app.cpp

make clean && make

./deepstream-gaze-app 1 ../../../configs/facial_tao/sample_faciallandmarks_config.txt file:///opt/nvidia/deepstream/deepstream-6.0/samples/configs/tao_pretrained_models/deepstream_tao_apps/apps/tao_others/deepstream-gaze-app/frame_816_571_0.png ./gazenet  |tee log.txt



Thanks Morganh. I will give it a try.


Hi Morganh,

I have changed lines 419 and 423 to:

    "softargmax/strided_slice:0") == 0) {
    "softargmax/strided_slice_1:0") == 0) {

so that the pipeline could work with the TAO gazenet model on Jetson Nano.
I could visualise the gaze on some new images, but the results did not look very accurate: when a person was looking straight ahead, the arrows pointed in different directions.
Looking at the script, I see that it uses a 3D face model and the intrinsic camera parameters from a public training dataset. Are these camera parameters the same as those of the dataset used to train the TAO gazenet model? If they are different, is that the reason why the visualisation is inaccurate? And where does the 3D face model come from?


Firstly, may I know if you have trained a new model against your own dataset?

Hi Morganh,

No, I haven’t trained a new model. I use this gaze model from NVIDIA.


Thanks for the info. I am afraid the different data distribution between your own images and the training images of the gaze model may result in inaccurate inference results.
If possible, could you try to run training against your own dataset?

To narrow down, with the gaze model from NVIDIA, can you run “tao inference” against your own dataset to check if it can work?
For the json file, you can leverage the facial points dumped from the deepstream-app.

Is this (the example photo) an accurate prediction of where you were looking?

Hi @Morganh,

I am not sure why we need to run “tao inference” on new images, as we have everything we need to visualise from the deepstream-app.
I cannot train the model on my new images as I don’t have labels for them. Besides, I don’t want to train the model; I just want to use it on new images.
In addition, what is the point of the gazenet model if it cannot be used on new images that are not from the training dataset?


Running “tao inference” on a test image is in order to check whether it can work. This will help narrow down the issue. If it works, the gaze model has no problem, and then we need to find the gap between “tao inference” and the deepstream-app.
I will run “tao inference” against your test image.

On your side, could you please resize your test image to 1280x720 and try again with deepstream-app?

Please run the above-mentioned approach 3 for better results.
Refer to the 3.21.08 doc Requirements and Installation — TAO Toolkit 3.0 documentation to download the scripts via TAO Computer Vision Inference Pipeline | NVIDIA NGC, or run
ngc registry resource download-version "nvidia/tao/tao_cv_inference_pipeline_quick_start:v0.3-ga"

Setup server

$ cd tao_cv_inference_pipeline_quick_start_vv0.3-ga/scripts
$ bash 
$ bash

Open another terminal to run client.

$ export DISPLAY=:0
$ cd tao_cv_inference_pipeline_quick_start_vv0.3-ga/scripts
$ bash

Modify several lines in samples/tao_cv/demo_gaze/demo.conf
root@xx:/workspace/tao_cv-pkg# vim samples/tao_cv/demo_gaze/demo.conf
            video_path=/tmp/yourtest.mp4
            fps=yourvideo_fps
            is_video_path_file=true
            resolution_whc=640,480,3

root@xx:/workspace/tao_cv-pkg# ./samples/tao_cv/demo_gaze/gaze samples/tao_cv/demo_gaze/demo.conf

BTW, run “docker cp” to copy your testvideo.mp4 into the client container.
$ docker cp yourtest.mp4 image_tao_cv_client:/tmp/

Hi @Morganh ,

Thanks for your reply. I will give it a try.
Do you know why the script did not work well on the data dumped from the deepstream app?


I will check further; I am not sure yet which part brings the difference.
As a quick solution, please use the above inference approach.

Thanks @Morganh.
My main purpose is to be able to visualise the gaze from the deepstream-gaze-app, as we are developing applications on Jetson Nano using DeepStream. Please let me know if you figure out what the issue was.


For the deepstream-gaze-app, the gaze visualization feature is still being developed by the DeepStream team. It will not be available in the short term; maybe two or three months from now.

BTW, you can also leverage the code in samples/tao_cv/demo_gaze/

root@xx:/workspace/tao_cv-pkg# ls samples/tao_cv/demo_gaze/
CMakeLists.txt  Demo.cpp  VizUtils.cpp  VizUtils.hpp  anthropometic_3D_landmarks.txt  demo.conf  gaze

Thanks @Morganh.


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.