FaceDetect IR Training using TLT 3.0 and Custom Dataset

rohitnairkp · April 16, 2021, 9:13am

Hi,

Hardware - DGPU

GPU - Tesla T4

I am trying to train a new face detection model on my own custom dataset. On the NVIDIA Developer website I found the TLT 3.0 getting-started documentation, which covers custom model training with a Jupyter notebook and related instructions: https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/quickstart/deepstream_integration.html

Before starting with the instructions, I installed all the prerequisites (Docker, the NVIDIA Container Toolkit, and the Jupyter notebook dependencies) using these guides:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#platform-requirements
https://www.thegeekdiary.com/run-docker-as-a-non-root-user/
Container Toolkit - https://psnow.ext.hpe.com/doc/a00094832enw
Virtual Environment - https://medium.com/@aaditya.chhabra/virtualenv-with-virtualenvwrapper-on-ubuntu-34850ab9e765

These are the TLT Jupyter notebook instructions I followed to train the model on the WIDER FACE dataset: GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC

All of the steps listed below were completed:

Set up env variables, map drives and install dependencies
Prepare dataset and pre-trained model
Provide training specification
Run TLT training
Evaluate the trained model
Prune the trained model
Retrain the pruned model
Evaluate the retrained model
Visualize inferences
Deploy
Verify Deployed Model

These are the folders created as outputs in the tlt-experiments directory,

According to the documentation, the final trained model is saved in the experiment_dir_final directory, which contains two files:

resnet18_detector.etlt
resnet18_detector.trt

Queries

1) How can I use this trained model to detect faces in a live camera feed or an input video file?
2) This is the config file that currently runs the live camera feed with the pretrained model: /opt/nvidia/deepstream/deepstream-.1/samples/configs/tlt_pretrained_models/deepstream_app_source1_facedetectir.txt
   I run this file using the command deepstream-app -c deepstream_app_source1_facedetectir.txt
3) Is there any way I can generate an int8.txt file for the trained model? The pretrained model came with one when I downloaded it from NGC.
4) How do I convert the model and generate the engine file?
5) How do I use a custom dataset? Is there a guide for labelling and preparing datasets?

I would appreciate it if anyone could guide me through this.

Thanks!

Morganh · April 16, 2021, 10:13am

1) Please refer to DetectNet_v2 (Transfer Learning Toolkit 3.0 documentation). End users can run inference via DeepStream.
2) Yes, you can refer to them. The files include config_infer_primary_facedetectir.txt, deepstream_app_source1_facedetectir.txt and labels_facedetectir.txt.
3) Refer to the two linked DetectNet_v2 sections of the Transfer Learning Toolkit 3.0 documentation. An example can also be found in tlt_cv_samples_v1.0.2/detectnet_v2/detectnet_v2.ipynb.
4) See DetectNet_v2 (Transfer Learning Toolkit 3.0 documentation). An example can also be found in section 10.B of tlt_cv_samples_v1.0.2/detectnet_v2/detectnet_v2.ipynb.
5) Refer to section 2 of tlt_cv_samples_v1.0.2/detectnet_v2/detectnet_v2.ipynb or tlt_cv_samples_v1.0.2/facenet/facenet.ipynb.

rohitnairkp · April 17, 2021, 6:46am

Hi @Morganh,

I'm not able to find the proper way to generate the INT8 txt file, as there are no instructions in facenet.ipynb on how to export it.

These are the files in the pretrained model I downloaded from NGC. I tested it, and it worked perfectly with video files, IP cameras and webcam streams.

I want to generate similar files for my custom-trained model, but I have been unable to create them.

Can you please guide me through the steps to create them for facenet? The detectnet_v2.ipynb notebook was too advanced for me to follow, as I am new to DeepStream.

Morganh · April 17, 2021, 3:32pm

FaceNet is based on the DetectNet_v2 network, so please refer to section 10.A of tlt_cv_samples_v1.0.2/detectnet_v2/detectnet_v2.ipynb.

The DetectNet_v2 model supports INT8 inference mode in TensorRT. To use INT8 mode, the model must first be calibrated to run 8-bit inference:

Generate a calibration tensorfile from the training data using detectnet_v2 calibration_tensorfile.

Use tlt-export to generate the INT8 calibration table.

!tlt detectnet_v2 calibration_tensorfile -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
    -m 10 \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor

!rm -rf $LOCAL_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt
!rm -rf $LOCAL_EXPERIMENT_DIR/experiment_dir_final/calibration.bin
!tlt detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
    -k $KEY \
    --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
    --data_type int8 \
    --batches 10 \
    --batch_size 4 \
    --max_batch_size 4 \
    --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
    --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
    --verbose
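
Once the export cell has finished, it can help to confirm that each expected artifact exists and is non-empty before copying anything to the inference device. The following is a minimal Python sketch and not part of the notebook. The helper name find_missing_or_empty is hypothetical, the file names are the ones used in the commands above, and LOCAL_EXPERIMENT_DIR is assumed to be the host-side experiment directory.

```python
import os


def find_missing_or_empty(paths):
    """Return (path, reason) pairs for files that are absent or zero bytes."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif os.path.getsize(path) == 0:
            problems.append((path, "empty"))
    return problems


if __name__ == "__main__":
    # Assumption: LOCAL_EXPERIMENT_DIR points at the host-side experiment folder.
    base = os.path.join(
        os.environ.get("LOCAL_EXPERIMENT_DIR", "."), "experiment_dir_final"
    )
    expected = [
        os.path.join(base, "resnet18_detector.etlt"),
        os.path.join(base, "calibration.bin"),
        os.path.join(base, "resnet18_detector.trt.int8"),
    ]
    for path, reason in find_missing_or_empty(expected):
        print(f"{reason}: {path}")
```

A zero-byte engine file usually suggests that the engine build step did not complete, in which case the --verbose export log is the place to look.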

rohitnairkp · April 19, 2021, 3:39am

Hi @Morganh,

These are the files generated after running the commands above:

The last file, resnet18_detector_trt.int8, shows 0 bytes.

How do I run the model with deepstream-app -c deepstream_app_source1_facedetectir.txt?

Morganh · April 19, 2021, 6:34am

Please share your full command and full log.

rohitnairkp · April 19, 2021, 7:08am

Hi @Morganh,

Below are the files in tlt_pretrained_models that I used earlier to run the pretrained FaceDetectIR model.

glueck@gluecktx2DS5:/opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models$ cat config_infer_primary_facedetectir.txt

    ################################################################################
    # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
    #
    # Permission is hereby granted, free of charge, to any person obtaining a
    # copy of this software and associated documentation files (the "Software"),
    # to deal in the Software without restriction, including without limitation
    # the rights to use, copy, modify, merge, publish, distribute, sublicense,
    # and/or sell copies of the Software, and to permit persons to whom the
    # Software is furnished to do so, subject to the following conditions:
    #
    # The above copyright notice and this permission notice shall be included in
    # all copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
    # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
    # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
    # DEALINGS IN THE SOFTWARE.
    ################################################################################

    [property]
    gpu-id=0
    net-scale-factor=0.0039215697906911373
    tlt-model-key=tlt_encode
    tlt-encoded-model=../../models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt
    labelfile-path=labels_facedetectir.txt
    int8-calib-file=../../models/tlt_pretrained_models/facedetectir/facedetectir_int8.txt
    model-engine-file=../../models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt_b1_gpu0_int8.engine
    input-dims=3;240;384;0
    uff-input-blob-name=input_1
    batch-size=1
    process-mode=1
    model-color-format=0
    ## 0=FP32, 1=INT8, 2=FP16 mode
    network-mode=1
    num-detected-classes=1
    interval=0
    gie-unique-id=1
    output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

    [class-attrs-all]
    pre-cluster-threshold=0.2
    group-threshold=1
    ## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
    eps=0.2
    #minBoxes=3

glueck@gluecktx2DS5:/opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models$ cat deepstream_app_source1_facedetectir.txt

################################################################################
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=2
num-sources=1
#uri=file://../../streams/sample_1080p_h265.mp4
uri=rtsp://root:Glueck321@10.0.1.36/axis-media/media.amp?streamprofile=H264
gpu-id=0

[streammux]
gpu-id=0
batch-size=1
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0

[osd]
enable=1
gpu-id=0
border-width=3
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial

[primary-gie]
enable=1
gpu-id=0
# Modify as necessary
model-engine-file=../../models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt_b1_gpu0_int8.engine
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
gie-unique-id=1
config-file=config_infer_primary_facedetectir.txt

[sink1]
enable=0
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=2000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
output-file=out.mp4
source-id=0

[sink2]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400

[tracker]
enable=1
tracker-width=640
tracker-height=384
#ll-lib-file=/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_nvdcf.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_mot_klt.so
#ll-config-file required for DCF/IOU only
ll-config-file=../deepstream-app/tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=0
#enable-batch-process applicable to DCF only
enable-batch-process=1

[tests]
file-loop=1

I usually run the file using the command:
deepstream-app -c deepstream_app_source1_facedetectir.txt
This is the output, where the face is detected:

While running this, I also see a warning in the console that INT8 is not supported and that FP16 is being tried instead.

I want to run the same file with my custom-trained model in either format, INT8 or FP16. The cell below is the only command mentioned in the ipynb for converting; there is no exporting step.

I followed the detectnet_v2 notebook and generated calibration.bin. Now I want to know how I can test the model on a real-time stream with either INT8 or FP16.

These files were generated by just replacing the paths in the cells with the path to the trained resnet18_detector.etlt in the facenet directory in tlt-experiments.

Morganh · April 19, 2021, 7:31am

As mentioned above, FaceNet is based on DetectNet_v2, so all the DetectNet_v2 commands can be used with FaceNet.
After training, you already have a .tlt file. Then see FaceDetect IR Training using TLT 3.0 and Custom Dataset - #4 by Morganh; this is the exporting step. It will generate the calibration.bin file and the resnet18_detector.etlt file.
Copy these two files to the device where you want to run inference, then set the following in the inference config:
int8-calib-file = calibration.bin
tlt-encoded-model = your_etlt_file
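
Both settings belong to the [property] group of the nvinfer config (config_infer_primary_facedetectir.txt in this thread). Below is a minimal Python sketch of patching them in place. The helper patch_nvinfer_config, the placeholder key YOUR_TLT_KEY and the relative file paths are assumptions for illustration. The sketch also updates tlt-model-key and model-engine-file, on the assumption that the key has to match the one used at export time and that a model-engine-file still pointing at the pretrained engine would be loaded instead of the new model.

```python
import re


def patch_nvinfer_config(text, updates):
    """Return config text with key=value lines replaced for keys in `updates`.

    Comment lines (starting with '#') and group headers are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        match = re.match(r"^(\s*)([A-Za-z0-9_-]+)\s*=", line)
        if match and match.group(2) in remaining:
            key = match.group(2)
            out.append(f"{match.group(1)}{key}={remaining.pop(key)}")
        else:
            out.append(line)
    if remaining:
        raise KeyError(f"keys not found in config: {sorted(remaining)}")
    return "\n".join(out) + "\n"


# Excerpt of the pretrained config shown earlier in the thread.
sample = """[property]
tlt-model-key=tlt_encode
tlt-encoded-model=../../models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt
int8-calib-file=../../models/tlt_pretrained_models/facedetectir/facedetectir_int8.txt
model-engine-file=../../models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt_b1_gpu0_int8.engine
"""

patched = patch_nvinfer_config(sample, {
    "tlt-model-key": "YOUR_TLT_KEY",  # placeholder: the $KEY used for export
    "tlt-encoded-model": "resnet18_detector.etlt",  # assumed location
    "int8-calib-file": "calibration.bin",  # assumed location
    # Assumed name, following the <etlt>_b1_gpu0_int8.engine pattern above.
    "model-engine-file": "resnet18_detector.etlt_b1_gpu0_int8.engine",
})
print(patched)
```

The [primary-gie] group in deepstream_app_source1_facedetectir.txt carries its own model-engine-file entry, so that line probably needs the same change.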

If you are running on a Nano, which does not support INT8, the log may show “INT8 not supported by platform”.

Also, if you run inference on a Nano, note what is mentioned in “A. Generate TensorRT engine”: “for the jetson devices, please download the converter for jetson from dev zone link”. Download tlt-converter on the Nano and run it against the .etlt file to generate an FP16 .trt engine file (set -t fp16). If you run it against both the .etlt file and the calibration.bin file, it will generate an INT8 .trt engine file (set -t int8).
This is the other option for deployment. See DetectNet_v2 (Transfer Learning Toolkit 3.0 documentation).
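
As a rough illustration of the tlt-converter route, the Python sketch below only assembles the argument list and does not run anything. The function name is hypothetical. The input dimensions and output node names are copied from the pretrained config shown earlier in this thread and are assumptions here: they have to match the training spec of the custom model. The key and file paths are placeholders.

```python
def build_tlt_converter_args(etlt_path, key, input_dims, output_nodes,
                             engine_path, data_type="fp16",
                             calib_cache=None, max_batch_size=1):
    """Build an argument list for tlt-converter (the command is not executed)."""
    if data_type not in ("fp32", "fp16", "int8"):
        raise ValueError(f"unsupported data type: {data_type}")
    if data_type == "int8" and calib_cache is None:
        raise ValueError("int8 needs the calibration cache (calibration.bin)")
    args = [
        "tlt-converter",
        "-k", key,                                  # key used during export
        "-d", ",".join(str(d) for d in input_dims), # C,H,W input dimensions
        "-o", ",".join(output_nodes),               # output node names
        "-t", data_type,                            # engine precision
        "-m", str(max_batch_size),                  # max engine batch size
        "-e", engine_path,                          # where the engine is written
    ]
    if data_type == "int8":
        args += ["-c", calib_cache]                 # calibration cache file
    args.append(etlt_path)                          # positional input model
    return args


# Placeholder values; dimensions and node names come from the pretrained config.
fp16_args = build_tlt_converter_args(
    etlt_path="resnet18_detector.etlt",
    key="YOUR_TLT_KEY",
    input_dims=(3, 240, 384),
    output_nodes=["output_cov/Sigmoid", "output_bbox/BiasAdd"],
    engine_path="resnet18_detector_fp16.engine",
)
print(" ".join(fp16_args))
```

On the device where tlt-converter is installed, the list could be passed to subprocess.run(args, check=True). TensorRT engines are generally specific to the GPU and TensorRT version they were built with, which is why the conversion is run on the target device.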

rohitnairkp · April 19, 2021, 8:01am

Hey @Morganh ,

Got the custom trained model working by following the steps mentioned above.

Thanks for your support on this post.

rohitnairkp · April 22, 2021, 2:26am

Hi @Morganh ,

The custom FaceNet model is now working using the steps you mentioned.

Queries:

1) I already have trained gender and age Caffe models and their respective prototxt files. Is there a way to use those Caffe models as secondary models in the same face detection config (deepstream_app_source1_facedetectir.txt)?
2) Should I also create calib.bin files for the Caffe models? If yes, how can that be done? Is there a script for creating calibration.bin files for these models?

Morganh · April 22, 2021, 10:49am

For 1), your FaceDetectIR model can work as the primary engine, and your gender/age Caffe models can work as secondary engines. A similar scenario can be seen in deepstream-5.0/samples/configs/deepstream-app/config_infer_secondary_vehicletypes.txt.
For 2), TLT does not create a calib.bin file for any Caffe model.
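
To make the primary/secondary arrangement more concrete, here is a hedged Python sketch that renders the two config pieces involved: a [secondary-gie0] group for the deepstream-app config and a [property] group for a new nvinfer config. The key names follow the vehicletypes sample. Every file name (gender.caffemodel, gender.prototxt, mean.ppm, labels_gender.txt, config_infer_secondary_gender.txt), the output blob name prob and the numeric values are placeholders, not taken from this thread.

```python
import configparser
import io


def render_config(groups):
    """Render {group: {key: value}} as key=value text, no spaces around '='."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key spelling exactly as given
    for group, values in groups.items():
        parser[group] = {key: str(value) for key, value in values.items()}
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Group to append to deepstream_app_source1_facedetectir.txt.
app_part = render_config({
    "secondary-gie0": {
        "enable": 1,
        "gpu-id": 0,
        "batch-size": 16,
        "gie-unique-id": 2,           # id of this classifier
        "operate-on-gie-id": 1,       # run on objects from the face detector
        "operate-on-class-ids": "0;", # the single face class
        "config-file": "config_infer_secondary_gender.txt",  # hypothetical
    }
})

# Contents of the hypothetical config_infer_secondary_gender.txt.
infer_part = render_config({
    "property": {
        "gpu-id": 0,
        "net-scale-factor": 1,
        "model-file": "gender.caffemodel",      # placeholder
        "proto-file": "gender.prototxt",        # placeholder
        "mean-file": "mean.ppm",                # placeholder
        "labelfile-path": "labels_gender.txt",  # placeholder
        "batch-size": 16,
        "network-mode": 2,    # FP16: assumed choice, needs no INT8 calibration
        "process-mode": 2,    # secondary mode, operates on detected objects
        "model-color-format": 1,
        "gie-unique-id": 2,
        "operate-on-gie-id": 1,
        "operate-on-class-ids": 0,
        "is-classifier": 1,
        "output-blob-names": "prob",            # placeholder layer name
        "classifier-threshold": 0.51,
    }
})

print(app_part)
print(infer_part)
```

Running in FP32 or FP16 (network-mode 0 or 2) is one way to sidestep the missing calibration file for the Caffe model; whether the accuracy and speed are acceptable would need to be checked on the target device.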

rohitnairkp · April 27, 2021, 9:36am

Hi @Morganh,

I found the scenario where vehicletypes runs as a secondary model. In my scenario I want to run our existing gender model, which is in Caffe. It has the model file and the prototxt file, as in the vehicle scenario. The only file missing is mean.ppm, as the Caffe model I have comes only with mean.binaryproto. What can be done to generate a mean file suitable for the secondary classifier?

Morganh · April 27, 2021, 9:47am

Using a Caffe model as the secondary classifier is not really a TLT topic. I see that you have already created a topic in the DeepStream forum for help. Please also refer to Mean file "mean.ppm" of deepstream - #5 by Amycao, which discusses how to generate the .ppm file.
More search results can be found at Search results for 'mean.ppm #intelligent-video-analytics:deepstream-sdk order:latest_topic' - NVIDIA Developer Forums
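
The linked thread covers the details. Purely as an illustration, the Python sketch below encodes a mean image as a binary PPM (P6) file. It assumes the mean has already been loaded from mean.binaryproto into a 3 x H x W nested list of floats, for example with Caffe's own Python tooling (that step is not shown). The function name and the channel_order parameter are hypothetical. Caffe mean blobs are commonly stored in BGR order, but this should be checked against the model.

```python
def mean_to_ppm_bytes(mean_chw, channel_order="bgr"):
    """Encode a 3 x H x W mean image (values 0..255) as binary PPM (P6) bytes."""
    if len(mean_chw) != 3:
        raise ValueError("expected exactly 3 channels")
    height = len(mean_chw[0])
    width = len(mean_chw[0][0])
    # PPM stores each pixel as R, G, B; map the source channel order onto that.
    rgb_index = [channel_order.index(c) for c in "rgb"]
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            for c in rgb_index:
                value = int(round(mean_chw[c][y][x]))
                pixels.append(min(255, max(0, value)))  # clamp to one byte
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixels)


# Tiny 1 x 2 example in B, G, R channel order.
example_mean = [
    [[10.0, 20.0]],  # blue
    [[30.0, 40.0]],  # green
    [[50.0, 60.0]],  # red
]
ppm = mean_to_ppm_bytes(example_mean)
with open("mean.ppm", "wb") as handle:
    handle.write(ppm)
```

The mean image dimensions should match the network input size of the classifier; a mismatch is one of the first things to check if the secondary engine misbehaves.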
