DeepStream implementation of working nwesem/mtcnn_facenet_cpp_tensorRT needed

DeepStream 5.0
TensorRT 7.0
GPU: T4
CUDA 10.2

Hi,
I have gone through the working TensorRT project GitHub - nwesem/mtcnn_facenet_cpp_tensorRT: Face Recognition on NVIDIA Jetson (Nano) using TensorRT. I want to implement the same in DeepStream. Could you kindly help?

Hi,

We have tested one of the MTCNN models, and all of its layers are supported by TensorRT:

$ /usr/src/tensorrt/bin/trtexec --deploy=./det1_relu.prototxt --output=conv4-2 --output=prob1

So the workflow should be simple.

1. Please start from our /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_FasterRCNN/ sample and modify it.

2. Implement your bounding-box parsing based on generateBbox.
Please wrap the bounding-box code into the same format as nvdsinfer_custom_impl_fasterRCNN (a sketch is given after the config example below).

3. Get your mean values based on this implementation.

4. Update the config_infer_primary_fasterRCNN.txt file:

[property]
...
offsets=[B/G/R mean-subtraction value]
model-file=[your/model/name].caffemodel
proto-file=[your/proto/name].prototxt
labelfile-path=[your/label/name].txt
num-detected-classes=1
output-blob-names=conv4-2;prob1
parse-bbox-func-name=[your/bbox/function/name]
custom-lib-path=[your/bbox/library/path]
...
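
For step 2, a minimal sketch of such a parser for the det1 (PNet) outputs is below. It is only an illustration built on assumptions from this thread: the function name NvDsInferParseCustomMTCNN, the 0.6 threshold, and the use of channel 1 of prob1 as the face probability are placeholders, and the conv4-2 regression offsets are left as a TODO. Adapt it to your model.

/* Hypothetical nvdsparsebbox_mtcnn.cpp: parse the PNet (det1) outputs.
 * Layer names and shapes are taken from this thread; everything else is
 * an assumption to be adapted. */
#include <cstring>
#include <iostream>
#include <vector>
#include "nvdsinfer_custom_impl.h"

extern "C" bool NvDsInferParseCustomMTCNN (
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
  const NvDsInferLayerInfo *prob = nullptr;  /* face probability (prob1)  */
  const NvDsInferLayerInfo *reg = nullptr;   /* bbox regression (conv4-2) */
  for (const auto &layer : outputLayersInfo) {
    if (strcmp (layer.layerName, "prob1") == 0)
      prob = &layer;
    else if (strcmp (layer.layerName, "conv4-2") == 0)
      reg = &layer;
  }
  if (!prob || !reg) {
    std::cerr << "Could not find prob1/conv4-2 layer buffers" << std::endl;
    return false;
  }

  /* prob1 is CHW = 2 x H x W; assume channel 1 holds the face score. */
  const unsigned int h = prob->inferDims.d[1];
  const unsigned int w = prob->inferDims.d[2];
  const float *score = (const float *) prob->buffer + h * w;

  const int stride = 2, cellsize = 12;  /* PNet constants */
  const float threshold = 0.6f;         /* assumed face threshold */

  for (unsigned int row = 0; row < h; row++) {
    for (unsigned int col = 0; col < w; col++) {
      float s = score[row * w + col];
      if (s < threshold)
        continue;
      NvDsInferObjectDetectionInfo obj;
      obj.classId = 0;
      obj.detectionConfidence = s;
      obj.left = col * stride;
      obj.top = row * stride;
      obj.width = cellsize;
      obj.height = cellsize;
      /* TODO: apply the conv4-2 regression offsets from reg->buffer and
       * clip the box to networkInfo.width / networkInfo.height. */
      objectList.push_back (obj);
    }
  }
  return true;
}

/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE (NvDsInferParseCustomMTCNN);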

Thanks.

Hi, I have tried to implement the same in nvdsparsebbox.cpp, but was not successful. Also, can you share how this implementation can be adapted if only one model is used?

/*
 * Copyright (c) 2018-2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/* Headers needed for strcmp, std::cerr and round */
#include <cstring>
#include <iostream>
#include <cmath>
#include "nvdsinfer_custom_impl.h"
#include "nvdssample_fasterRCNN_common.h"

#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define CLIP(a,min,max) (MAX(MIN(a, max), min))

/* This is a sample bounding box parsing function for the sample FasterRCNN
 * detector model provided with the TensorRT samples. */

extern "C"
bool NvDsInferParseCustomFasterRCNN (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList);

/* C-linkage to prevent name-mangling */
extern "C"
bool NvDsInferParseCustomFasterRCNN (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
  static int bboxPredLayerIndex = -1;
  static int clsProbLayerIndex = -1;
  static int roisLayerIndex = -1;
  static const int NUM_CLASSES_FASTER_RCNN = 21;
  static bool classMismatchWarn = false;
  int numClassesToParse;

  if (bboxPredLayerIndex == -1) {
    for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
      if (strcmp(outputLayersInfo[i].layerName, "bbox_pred") == 0) {
        bboxPredLayerIndex = i;
        break;
      }
    }
    if (bboxPredLayerIndex == -1) {
      std::cerr << "Could not find bbox_pred layer buffer while parsing" << std::endl;
      return false;
    }
  }

  if (clsProbLayerIndex == -1) {
    for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
      if (strcmp(outputLayersInfo[i].layerName, "cls_prob") == 0) {
        clsProbLayerIndex = i;
        break;
      }
    }
    if (clsProbLayerIndex == -1) {
      std::cerr << "Could not find cls_prob layer buffer while parsing" << std::endl;
      return false;
    }
  }

  if (roisLayerIndex == -1) {
    for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
      if (strcmp(outputLayersInfo[i].layerName, "rois") == 0) {
        roisLayerIndex = i;
        break;
      }
    }
    if (roisLayerIndex == -1) {
      std::cerr << "Could not find rois layer buffer while parsing" << std::endl;
      return false;
    }
  }

  if (!classMismatchWarn) {
    if (NUM_CLASSES_FASTER_RCNN != detectionParams.numClassesConfigured) {
      std::cerr << "WARNING: Num classes mismatch. Configured:" <<
          detectionParams.numClassesConfigured << ", detected by network: " <<
          NUM_CLASSES_FASTER_RCNN << std::endl;
    }
    classMismatchWarn = true;
  }

  numClassesToParse = MIN (NUM_CLASSES_FASTER_RCNN,
      detectionParams.numClassesConfigured);

  /* Output buffers looked up by the FasterRCNN sample code (unused below). */
  float *rois = (float *) outputLayersInfo[roisLayerIndex].buffer;
  float *deltas = (float *) outputLayersInfo[bboxPredLayerIndex].buffer;
  float *scores = (float *) outputLayersInfo[clsProbLayerIndex].buffer;

  /* Bounding-box generation copied from the MTCNN generateBbox() code.
   * Note: score, location, scale, Pthreshold, mydataFmt, Bbox, orderScore,
   * boundingBox_ and bboxScore_ are defined in the MTCNN repository, not in
   * this file. */
  int stride = 2;
  int cellsize = 12;
  int count = 0;
  // score p
  mydataFmt *p = score->pdata + score->width * score->height;
  mydataFmt *plocal = location->pdata;
  struct Bbox bbox;
  struct orderScore order;
  for (int row = 0; row < score->height; row++) {
    for (int col = 0; col < score->width; col++) {
      if (*p > Pthreshold) {
        bbox.score = *p;
        order.score = *p;
        order.oriOrder = count;
        bbox.x1 = round((stride * row + 1) / scale);
        bbox.y1 = round((stride * col + 1) / scale);
        bbox.x2 = round((stride * row + 1 + cellsize) / scale);
        bbox.y2 = round((stride * col + 1 + cellsize) / scale);
        bbox.exist = true;
        bbox.area = (bbox.x2 - bbox.x1) * (bbox.y2 - bbox.y1);
        for (int channel = 0; channel < 4; channel++)
          bbox.regreCoord[channel] = *(plocal + channel * location->width * location->height);
        boundingBox_.push_back(bbox);
        bboxScore_.push_back(order);
        count++;
      }
      p++;
      plocal++;
    }
  }
  return true;
}

/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomFasterRCNN);

Hi,

Could you share the error message you met with us?
Thanks.

A parsing error occurred. I think it is due to the implementation I described in my previous post.

Hi,

Would you mind sharing the error log with us?
Thanks.

Hi @AastaLLL, kindly find the error log below:
deepstream-app -c deepstream_app_config_fasterRCNN.txt
Warn: ‘threshold’ parameter has been deprecated. Use ‘pre-cluster-threshold’ instead.
Warn: ‘threshold’ parameter has been deprecated. Use ‘pre-cluster-threshold’ instead.
WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:34 [TRT]: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
0:00:01.818410865 27752 0x5567a78c6960 INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1577> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_FasterRCNN/det1_relu1.engine
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT data 3x96x128
1 OUTPUT kFLOAT conv4-2 4x43x59
2 OUTPUT kFLOAT prob1 2x43x59

0:00:01.818493980 27752 0x5567a78c6960 INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1681> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_FasterRCNN/det1_relu1.engine
0:00:01.819205855 27752 0x5567a78c6960 INFO nvinfer gstnvinfer_impl.cpp:311:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_FasterRCNN/config_infer_primary_fasterRCNN.txt sucessfully

Runtime commands:
h: Print this help
q: Quit

p: Pause
r: Resume

**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:181>: Pipeline ready

** INFO: <bus_callback:167>: Pipeline running

Could not find bbox_pred layer buffer while parsing
0:00:01.967867822 27752 0x556798c486d0 ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:564> [UID = 1]: Failed to parse bboxes using custom parse function
Segmentation fault (core dumped)

Hi,

Please remember to update the output layer names used in the customized parser:

if (strcmp(outputLayersInfo[i].layerName, "bbox_pred") == 0) {
    bboxPredLayerIndex = i;
    break;
}

The bounding-box regression layer of your model should presumably be conv4-2.
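For example (a sketch only; adjust to your actual output names), the lookups in the copied parser would become:

if (strcmp(outputLayersInfo[i].layerName, "conv4-2") == 0) {
    bboxPredLayerIndex = i;   /* bounding-box regression output */
    break;
}

if (strcmp(outputLayersInfo[i].layerName, "prob1") == 0) {
    clsProbLayerIndex = i;    /* face probability output */
    break;
}

Note that the rois lookup has no counterpart in the two-output det1 model, so it would need to be removed or guarded; otherwise the parser will keep returning false.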
Thanks.