DeepStream: creating a custom age/gender model with TLT

Below is the general information:

• Hardware Platform (Jetson / GPU): Jetson Nano
• DeepStream Version: Deepstream 5.0
• JetPack Version (valid for Jetson only): Jetpack 4.4
• TensorRT Version: compiled 7.0
• NVIDIA GPU Driver Version (valid for GPU only)

I would kindly like to ask what the best approach is to create a custom model with TLT for age and gender detection.
A classification model alone would not be sufficient, as there could be multiple different persons in a frame, and I would also like to combine this model with another model.

My questions are:
- What is the best practice for this gender/age topic when I would like to use TLT?
- Would it make sense to use an object detection model and train on the face images (where I would need to generate the KITTI data), or would it be necessary to use the FaceDetect model and (somehow) enhance it with gender/age training?

thanks for your help!
Martin

It is necessary to train a classification model first. This model differentiates the different combinations of age and gender. Before training, you need to split the images into the different classes you want to train.
Then use the PeopleNet model as the primary model in DeepStream, and configure the classification model as a secondary model.
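As a sketch of that pre-training step: assuming the age/gender labels are available in a simple CSV (`labels.csv` with `filename,gender,age_bucket` columns is a hypothetical format, not something TLT prescribes), the images could be sorted into one folder per combined class like this:

```python
import csv
import shutil
from pathlib import Path

def split_into_classes(image_dir, labels_csv, out_dir):
    """Copy each image into a folder named after its combined gender/age class."""
    out = Path(out_dir)
    with open(labels_csv, newline="") as f:
        for row in csv.DictReader(f):
            # Combined class, e.g. "male_0-10" -- one folder per combination
            class_name = f"{row['gender']}_{row['age_bucket']}"
            dest = out / class_name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy(Path(image_dir) / row["filename"], dest / row["filename"])
```

TLT classification training expects exactly this one-subfolder-per-class layout as its input directory.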

Hello,
thanks for your feedback.
Now for this use case, as I have age and gender as a combination, do I end up with one class per detection or with two separate classes which get combined? Because I want to get a result like: male: 0-10.

I mean: for example, should I have the classes (folder structure):
class-A: male:0-10
class-B: female: 0-10
class-C: male: 11-20

or is it:
class-A: male (50% of dataset)
class-B: female (remaining 50% of dataset)
class-C: 0-10
…which then get combined?

I hope you see my point :-)

thanks, Martin

Suggest splitting your images into folders as below.
class-A: male:0-10
class-B: female: 0-10
class-C: male: 11-20
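To illustrate the combined-class scheme, a small sketch that enumerates one folder name per gender/age combination (the bucket boundaries here are only an assumption for illustration; the engine log later in the thread shows the actual model ended up with 37 classes, so its bucketing was finer):

```python
# Combined classes: one per (gender, age bucket) pair -- not separate
# gender and age classes that get merged afterwards.
GENDERS = ["male", "female"]
AGE_BUCKETS = ["0-10", "11-20", "21-30", "31-40", "41-50", "51-60", "61+"]

def class_folders(genders=GENDERS, age_buckets=AGE_BUCKETS):
    """Return the training folder name for every gender/age combination."""
    return [f"{g}_{a}" for g in genders for a in age_buckets]
```

With these assumed buckets, 2 genders x 7 age ranges yields 14 class folders.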


Hello,
having now created a model for age/gender based on classification and also generated the engine on the Jetson, I have used the PeopleNet DeepStream model and added a secondary-gie0 into the config.
This works for another model which I have trained and loaded onto the Jetson (e.g. DetectNet), but when I replace the secondary-gie0 config from the DetectNet model with the generated classification model and start DeepStream, the secondary classification model gets loaded but its output is somehow not visible (I am running DeepStream with a webcam as input source).

-> What I would expect is that PeopleNet recognizes the face and, with the face as input, the classification model (age/gender) recognizes the related class and shows it as a bbox with the class name.
-> But how do I configure this?

Below are my log from the Jetson and my configs:

log

jetson@nano01:~/tlt-racket/nano-jetson/tlt-configs$ deepstream-app -c my_deepstream_app_source1_peoplenet.txt
2020-09-19 10:14:22.966787: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

(deepstream-app:10238): GStreamer-CRITICAL **: 10:14:23.790: passed ‘0’ as denominator for `GstFraction’

(deepstream-app:10238): GStreamer-CRITICAL **: 10:14:23.790: passed ‘0’ as denominator for `GstFraction’
Warning: ‘input-dims’ parameter has been deprecated. Use ‘infer-dims’ instead.
Warning: ‘input-dims’ parameter has been deprecated. Use ‘infer-dims’ instead.

Using winsys: x11
0:00:06.902412681 10238 0x2685430 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 4]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1701> [UID = 4]: deserialized trt engine from :/home/jetson/tlt-racket/nano-jetson/tlt-engines/classification_age_gender.engine
INFO: [Implicit Engine Info]: layers num: 2
0 INPUT kFLOAT input_1 3x224x224
1 OUTPUT kFLOAT predictions/Softmax 37x1x1

0:00:06.902697167 10238 0x2685430 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 4]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1805> [UID = 4]: Use deserialized engine model: /home/jetson/tlt-racket/nano-jetson/tlt-engines/classification_age_gender.engine
0:00:07.090150585 10238 0x2685430 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<secondary_gie_0> [UID 4]: Load new model:/home/jetson/tlt-racket/nano-jetson/tlt-configs/my_config_infer_secondary_age_gender.txt sucessfully
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is OFF
gstnvtracker: Past frame output is OFF
0:00:07.610666230 10238 0x2685430 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1701> [UID = 1]: deserialized trt engine from :/home/jetson/tlt-racket/nano-jetson/tlt-models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input_1 3x544x960
1 OUTPUT kFLOAT output_bbox/BiasAdd 12x34x60
2 OUTPUT kFLOAT output_cov/Sigmoid 3x34x60

0:00:07.610824932 10238 0x2685430 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1805> [UID = 1]: Use deserialized engine model: /home/jetson/tlt-racket/nano-jetson/tlt-models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_fp16.engine
0:00:07.623889711 10238 0x2685430 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/home/jetson/tlt-racket/nano-jetson/tlt-configs/config_infer_primary_peoplenet.txt sucessfully

Runtime commands:
h: Print this help
q: Quit

p: Pause
r: Resume

NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
To go back to the tiled display, right-click anywhere on the window.

**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:181>: Pipeline ready

** INFO: <bus_callback:167>: Pipeline running

**PERF: 0.00 (0.00)
KLT Tracker Init
**PERF: 5.73 (4.03)
**PERF: 5.04 (4.67)
**PERF: 4.96 (4.80)
**PERF: 4.98 (4.86)
**PERF: 4.98 (4.89)
**PERF: 4.97 (4.91)
**PERF: 4.98 (4.92)
**PERF: 4.98 (4.93)
**PERF: 4.98 (4.94)
**PERF: 4.98 (4.95)
**PERF: 4.98 (4.95)
**PERF: 4.99 (4.95)
**PERF: 4.98 (4.96)
Quitting
App run successful

my_deepstream_app_source1_peoplenet

################################################################################
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1

[tiled-display]
enable=1
rows=1
columns=1
width=1920
height=1080
gpu-id=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=1
num-sources=1

#uri=file://../../streams/sample_1080p_h265.mp4
gpu-id=0
camera-width=1920
camera-height=1080
camera-fps-n=24
camera-v4l2-dev-node=0
[streammux]
gpu-id=0
batch-size=1
batched-push-timeout=40000

# Set muxer output width and height

width=1920
height=1080

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0

[osd]
enable=1
gpu-id=0
border-width=3
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial

[primary-gie]
enable=1
gpu-id=0

# Modify as necessary

model-engine-file=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_fp16.engine

batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
gie-unique-id=1
config-file=config_infer_primary_peoplenet.txt

[sink1]
enable=0
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=2000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
output-file=out.mp4
source-id=0

[sink2]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0

# Set below properties in case of RTSPStreaming

rtsp-port=8554
udp-port=5400

[tracker]
enable=1
tracker-width=640
tracker-height=384
#ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_nvdcf.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so
#ll-config-file required for DCF/IOU only
ll-config-file=../deepstream-app/tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=0
#enable-batch-process applicable to DCF only
enable-batch-process=1

[secondary-gie0]
enable=1
gpu-id=0
batch-size=4
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=my_config_infer_secondary_age_gender.txt

[tests]
file-loop=0

my_config_infer_secondary_age_gender


[property]
gpu-id=0
net-scale-factor=1
offsets=124;117;104
tlt-model-key=tlt_encode
tlt-encoded-model=../tlt-models/classification/export/age_gender_final_model.etlt
labelfile-path=../tlt-labels/classification_age_gender.txt
model-engine-file=../tlt-engines/classification_age_gender.engine
input-dims=3;224;224;0
uff-input-blob-name=input_1
batch-size=4
process-mode=2
model-color-format=0

network-mode=2
network-type=1
num-detected-classes=37
interval=0
gie-unique-id=1
output-blob-names=predictions/Softmax
classifier-threshold=0.2
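The engine log above shows the classifier output is a 37-way softmax (predictions/Softmax, 37x1x1). As a rough sketch of how the classifier-threshold setting applies to such an output (plain Python, with a hypothetical label list):

```python
def classify(softmax, labels, threshold=0.2):
    """Return the top label if its probability clears the threshold, else None.

    This mirrors what nvinfer does with a Softmax output and
    classifier-threshold: only the highest-probability class is attached to
    the object metadata, and only if its score passes the threshold.
    """
    best = max(range(len(softmax)), key=lambda i: softmax[i])
    return labels[best] if softmax[best] >= threshold else None
```

For example, `classify([0.1, 0.7, 0.2], ["male_0-10", "male_11-20", "female_0-10"])` returns "male_11-20"; if no class reaches 0.2, no label is attached at all, which is one reason a secondary classifier can load fine yet show nothing on screen.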

As mentioned in Deepstream on Nano: running detectnet and classification at same time, please confirm you can run /opt/nvidia/deepstream/deepstream/samples/configs/tlt_pretrained_models/deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt successfully.
Note that in that sample, secondary-gie0 and secondary-gie1 are classification models.

Yes, I can run the example.
The PeopleNet example also works fine. In my created DeepStream config, I have added the secondary-gie0 with the following settings:

[secondary-gie0]
enable=1
gpu-id=0
batch-size=4
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=my_config_infer_secondary_age_gender.txt

-> This gets loaded successfully, but without the result showing up on the detection.

I think the "operate-on-class-ids=0" has an effect, because the PeopleNet labels are [Person, Bag, Face] and I would need the Face class, which I think would be id=2 (if the count starts from 0). I will also create a new model with only 2 classes (gender) and give it a try.
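The class-id question can be checked directly against the PeopleNet label file order, since nvinfer assigns ids by 0-based position in the label file:

```python
# PeopleNet label order as given in the thread: [Person, Bag, Face]
PEOPLENET_LABELS = ["Person", "Bag", "Face"]

def class_id(label, labels=PEOPLENET_LABELS):
    """Return the 0-based class id nvinfer assigns to a label."""
    return labels.index(label)
```

So running the classifier on faces would need `operate-on-class-ids=2;` rather than `0;` (which is Person).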

Is there anything I need to check specifically when creating the classification model in TLT?
br, Martin

  1. Please double-check the config file for your age/gender model. You should confirm that this classification model works well on its own.
  2. One more tip: if possible, you can retrain the unpruned PeopleNet model. Train it with one class (face only).
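For the retraining tip, a detectnet_v2 training spec can map every annotation to a single target class; a minimal sketch of the relevant dataset_config section (paths are placeholders, not from the thread):

```
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/data/training"
  }
  # Keep only the face class; classes without a mapping are dropped
  target_class_mapping {
    key: "face"
    value: "face"
  }
}
```

The matching classification/bbox sections of the spec would then also list only the "face" class.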

Training the unpruned model: can you describe what needs to be done? Or do you have a link with details on training PeopleNet only for faces?

Hi Martin1,
I suggest using https://ngc.nvidia.com/catalog/models/nvidia:tlt_facedetectir directly.


thanks. I got it working now.

@martin1

Sounds like you are working on something fun and exciting.
I have been thinking about a similar workflow - a 2-step process (1. find an object, 2. run text extraction).

When you said that you got it working, I was curious if I understood you correctly.
Did you combine the 2 models into 1 by retraining?
Or did you end up using 1 model to find the face and then another model to figure out the age group?

Would appreciate your feedback.

Thanks,
Jae

Hi Jae,
First, I tried with the unpruned PeopleNet model and tried to train only the face class, but somehow that did not work. Then I went with a separate model running for the age/gender detection, but in the end I went with the Azure API for the gender/age detection…
Br Martin


Appreciate you sharing the info!