Multiple DeepStream TAO applications

Is it possible to achieve stable operation of the detectors listed below (a gesture detector and a body-pose (skeletonization) detector are planned for the future)?

  1. gaze;
  2. emotions.

Yes, these detectors run stably (no crashes, unlike “TAO CV Pipeline l4t” - NVIDIA NGC). However, judging by the DeepStream TAO rendering (deepstream_tao_apps/apps/tao_others at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub ) on the video, the boxes marking the eyes sometimes “creep”: they appear on the eyebrows, for example, or drift onto the background.

p.s. In parallel, I would like to clarify two IMPORTANT questions. Is it possible to solve the following problems:

  1. the gaze / emotion detector detects a “parasitic” face inside the main face, or finds a “parasitic” face somewhere in the background;
  2. the boxes marking the eyes “float”?

p.s. p.s. Unlike eye trackers that use IR illumination, does the color of the iris matter here (for example, the pupil is hard to distinguish against a brown iris)?


professorx@x-mansion:~/ssd/jetsonUtilities$ python3
NVIDIA Jetson Xavier NX (Developer Kit Version)
L4T 32.6.1 [ JetPack 4.6 ]
Ubuntu 18.04.6 LTS
Kernel Version: 4.9.253-tegra
CUDA 10.2.300
CUDA Architecture: 7.2
OpenCV version: 4.1.1
OpenCV Cuda: NO
Vision Works:
VPI: ii libnvvpi1 1.1.15 arm64 NVIDIA Vision Programming Interface library
Vulcan: 1.2.70

Addition to the question from the topic header…
In the TAO Toolkit Computer Vision Inference Pipeline API, this function was used to determine whether the person is looking at the camera:

bool isLookingAtCamera(const njv::Gaze &gaze)
{
    static constexpr uint32_t MAGNITUDE = 100;

    // return instantaneous evaluation of gaze within circular region in xy
    // around the camera
    return std::sqrt(pow(gaze.x, 2) + pow(gaze.y, 2)) < MAGNITUDE;
}

The incoming data looked like a structure:

struct Gaze
{
    float x;
    float y;
    float z;
    float theta;
    float phi;
};

I tried to use the same function (changing the argument to a plain float array of x, y, z, theta, phi) in: deepstream_tao_apps/apps/tao_others/deepstream-gaze-app at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub

bool isLookingAtCamera(float *gaze, int size)
{
    static constexpr uint32_t MAGNITUDE = 100;

    // expect exactly five values: x, y, z, theta, phi
    if (size != 5)
        return false;
    // return instantaneous evaluation of gaze within circular region in xy
    // around the camera
    return std::sqrt(pow(gaze[0], 2) + pow(gaze[1], 2)) < MAGNITUDE;
}

Unfortunately, determining the gaze direction this way practically does not work. Even on the “ideal” demo video for the gaze detector, this function gives unstable results ( Gaze Demo for Jetson/L4T | NVIDIA NGC )
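One possible way to reduce flicker from the instantaneous check above is to smooth the per-frame decision over a short window of frames. The following is only a minimal sketch under that assumption: the class name GazeDebouncer and the window size are hypothetical and are not part of deepstream_tao_apps or the TAO CV Pipeline API. It does not fix wrong gaze values; it only suppresses single-frame flips.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical helper (not from any NVIDIA API): smooths a per-frame
// boolean decision with a majority vote over the last `window` frames.
class GazeDebouncer {
public:
    explicit GazeDebouncer(std::size_t window) : window_(window) {}

    // Feed the instantaneous result for one frame;
    // returns the smoothed decision for that frame.
    bool update(bool lookingNow) {
        history_.push_back(lookingNow);
        if (lookingNow) ++hits_;
        // Drop the oldest frame once the window is exceeded.
        if (history_.size() > window_) {
            if (history_.front()) --hits_;
            history_.pop_front();
        }
        // Strict majority of the frames currently in the window.
        return 2 * hits_ > history_.size();
    }

private:
    std::size_t window_;
    std::size_t hits_ = 0;
    std::deque<bool> history_;
};
```

In use, one debouncer would be kept per tracked face and fed the result of isLookingAtCamera on every frame; the box color would then follow the smoothed value rather than the raw one.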

I would like to learn more about the gaze parameters x, y, z, theta, and phi.
In what units is the gaze detector output measured?
The attachment contains two gaze detector output videos (magnitudes 100 and 200).

raw output from DeepStream

May I know if it is 100% reproducible with the default apps from GitHub? If not, could you please share a test video?

Same as above: may I know how I can reproduce this with the default app?

Refer to What are theta and phi in the gazenet model outputs? - #3 by Morganh
It is the gaze vector.
theta (float): Gaze pitch in radians.
phi (float): Gaze yaw in radians.
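Since theta and phi are angles in radians, one alternative to thresholding x and y is to threshold the angles directly. The sketch below is an illustration only. It assumes that zero pitch and zero yaw correspond to looking straight at the camera, which this thread does not confirm, and the function name and the tolerance are hypothetical.

```cpp
#include <cmath>

constexpr float kPi = 3.14159265358979f;

// Converts degrees to radians so the tolerance can be given in degrees.
constexpr float degToRad(float deg) { return deg * kPi / 180.0f; }

// Hypothetical check (not part of the gaze app): the gaze counts as
// "toward the camera" when both pitch (theta) and yaw (phi), in radians,
// are within maxAngleRad of zero.
// ASSUMPTION: theta == 0 and phi == 0 mean "looking straight at the camera".
bool isLookingAtCameraByAngle(float theta, float phi, float maxAngleRad) {
    return std::fabs(theta) < maxAngleRad && std::fabs(phi) < maxAngleRad;
}
```

For example, isLookingAtCameraByAngle(theta, phi, degToRad(10.0f)) accepts gazes within 10 degrees of the assumed camera direction on both axes; the right tolerance would have to be found experimentally.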

Happy New Year and Merry Christmas.

I made only minimal changes to the original gaze detector code:

  1. porting the function from TAO CV Pipeline that determines whether the gaze is directed at the camera (the color of the boxes around the eyes is set depending on the result);
  2. adding debug output to the console in several places in the code.

All this time I have been working with a clone of the repository from Dec 8, 2021.

git info
root@x-mansion:~/ssd/DeepStream/deepstream_tao_apps# git branch
* release/tao3.0_ds6.0ga
root@x-mansion:~/ssd/DeepStream/deepstream_tao_apps# git branch -r
  origin/HEAD -> origin/master
  origin/master
  origin/release/tao3.0
  origin/release/tao3.0_ds6.0ga
  origin/release/tlt2.0
  origin/release/tlt2.0.1
  origin/release/tlt3.0
root@x-mansion:~/ssd/DeepStream/deepstream_tao_apps# git show --summary
commit a0665e8909ebefc2924c57e31f5602ca4c31f6ca (HEAD -> release/tao3.0_ds6.0ga, origin/release/tao3.0_ds6.0ga)
Author: Fei Chen <>
Date: Wed Dec 8 21:28:53 2021 +0800

    Fix README error

root@x-mansion:~/ssd/DeepStream/deepstream_tao_apps#

I manually pulled the original gaze detector file (deepstream_gaze_app.cpp) from the “master” branch. After rebuilding the application, the “spreading” of the eye boxes seemed to decrease a little, but “parasitic” faces seemed to appear a little more often (BUT I did not make any global changes to the gaze detector code o_O).

Relative to which point are the X, Y, Z coordinates calculated?

p.s. The attached video files contain the original file and the result (of the original gaze detector code):


WIN_20211224_19_13_19_Pro_gaze_raw-data.264 (9.4 MB)

Sorry for the late reply. In the h264 result you shared, I can find the frame containing the “parasitic” face.

Could you change the face threshold in deepstream_tao_apps/config_infer_primary_facenet.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub and retry?

I also apologize for the late reply.
I changed my smartphone, and, for some reason, I did not receive notifications from the forum until I turned on the old smartphone…
All Hail Megatr… Google…

You suggest editing the thresholds in the parameter group:



Yes, first try setting a larger pre-cluster-threshold and retry.
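To illustrate why a larger threshold removes “parasitic” faces: a detection threshold discards candidate boxes whose confidence is too low, and false faces tend to have lower confidence than real ones. The sketch below only mimics that effect on a plain vector. The FaceDetection struct and filterByConfidence function are hypothetical stand-ins, not DeepStream types; in the real pipeline the filtering is done inside the detector plugin according to the config file.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for one face detector output; not a DeepStream type.
struct FaceDetection {
    float left, top, width, height;
    float confidence;  // detector score in [0, 1]
};

// Keeps only the detections whose confidence reaches the threshold.
std::vector<FaceDetection> filterByConfidence(
    const std::vector<FaceDetection>& in, float threshold) {
    std::vector<FaceDetection> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out),
                 [threshold](const FaceDetection& d) {
                     return d.confidence >= threshold;
                 });
    return out;
}
```

With a low threshold, a weak background detection survives alongside the real face; raising the threshold drops it, at the cost of possibly missing real faces that are small, turned away, or poorly lit.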

I increased the threshold value, and false faces no longer appear, even on videos with many foreign objects in the background.


Source videos and results (*.264) in the attached archive. (32.7 MB)

Thanks for the update. The inference result looks ok.