How to run Nvidia's example torch SSD net on Deepstream-App with objectDetector_SSD's custom plugin

ai12 · April 7, 2021, 12:29am

Please provide complete information as applicable to your setup.

**• Hardware Platform (Jetson / GPU)**nvidia GPU
Net trained on Jetson Xavier NX, deepstream-app running on
• DeepStream Version
5.0.1
• JetPack Version (valid for Jetson only)
• TensorRT Version
7.0.0
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
follow the steps I describe
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hello everyone!
I’m trying to run custom models on deepstream app. My current goal is to do it with torch models.
This is what I have done so fr.

Firstly I followed this guide to launch the dustynv/jetson-inference container on a Jetson-Xavier NX.
Then I followed this guide to train a SSD model on the container, then I generated the ONNX file (I used all 6500 images).
Then I moved the ONNX file to a server running a container made from nvcr.io/nvidia/deepstream:5.0.1-20.09-devel image, where I made a copy of objectDetector_SSD example.
Next step was to modify the config files and run the app. The fp16.enfine file was generated successfully, but then, on runtime, I got segmentation fault.
Later , based on this forum answer, I modified nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp, I changed ‘NMS’ and ‘NMS1’ with ‘scores’ and ‘boxes’ respectively.

I can run the app, but bounding-boxes are all wrong (very small, on a left-top corner) and often the app breaks by a segmentation-fault I haven’t been able to figure.
to avoid this segmentation faults, I limited the for on the line 96 of the nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp file, but it appears again if I attempt to increase the net-scale factor.

So, how can I run the torch net I trained on Deepstream-App?
Than you.

This is my app’s config file.
################################################################################
# Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the “Software”),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1
gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=0
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
num-sources=1
uri=file:/home/<user>/dev/nvidia/samples/streams/sample_720p.mp4
gpu-id=0
cudadec-memtype=0

[source1]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=4
uri=rtsp://<rtsp stream from local camera>
#drop-frame-interval=2
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[streammux]
gpu-id=0
batch-size=1
batched-push-timeout=-1
## Set muxer output width and height
width=1920
height=1080
nvbuf-memory-type=0

[sink0]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=1
sync=1
source-id=0
gpu-id=0

[sink1]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=1
bitrate=3000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
# set below properties in case of RTSPStreaming  
rtsp-port=8555
udp-port=5400

[sink2]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=3
sync=1
source-id=0
gpu-id=0
qos=0
nvbuf-memory-type=0
overlay-id=1
container=1
codec=1
output-file=output.mp4

[osd]
enable=1
gpu-id=0
border-width=3
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=1
interval=0
labelfile-path=/home/<user>/dev/nvidia/proyectos/ejemplos/ssd_fruit/labels.txt
model-engine-file=/home/<user>/dev/nvidia/proyectos/ejemplos/ssd_fruit/ssd-mobilenet.onnx_b1_gpu0_fp16.engine
config-file=config_infer_primary_ssd.txt
nvbuf-memory-type=0

[tests]
file-loop=0

and this is the inference config-file

################################################################################
# Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8), model-file-format
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes,
#   custom-lib-path,
#   parse-bbox-func-name
#
# Optional properties for detectors:
#   cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=0
net-scale-factor=0.0078431372
offsets=127.5;127.5;127.5
model-color-format=0
model-engine-file=/home/<user>/dev/nvidia/proyectos/ejemplos/ssd_fruit/ssd-mobilenet.onnx_b1_gpu0_fp16.engine
labelfile-path=/home/<user>/dev/nvidia/proyectos/ejemplos/ssd_fruit/labels.txt
#uff-file=sample_ssd_relu6.uff
infer-dims=3;300;300
#uff-input-order=0
#uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=9
interval=0
gie-unique-id=1
is-classifier=0
#output-blob-names=MarkOutput_0
parse-bbox-func-name=NvDsInferParseCustomSSD
custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
pre-cluster-threshold=0.95
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

## Per class configuration
#[class-attrs-2]
#threshold=0.6
#roi-top-offset=20
#roi-bottom-offset=10
#detected-min-w=40
#detected-min-h=40
#detected-max-w=400
#detected-max-h=800

and this is the nvdsparsebbox_ssd.cpp file with a few changes

NUM_CLASSES_SSD = 9 instead of 91
layer names are compared to ‘scores’ and ‘boxes’
added “|| i<10” on the for cycle
added some printf functions

the rest is the same as the original example file.

/*
    * Copyright (c) 2018-2019, NVIDIA CORPORATION. All rights reserved.
    *
    * Permission is hereby granted, free of charge, to any person obtaining a
    * copy of this software and associated documentation files (the "Software"),
    * to deal in the Software without restriction, including without limitation
    * the rights to use, copy, modify, merge, publish, distribute, sublicense,
    * and/or sell copies of the Software, and to permit persons to whom the
    * Software is furnished to do so, subject to the following conditions:
    *
    * The above copyright notice and this permission notice shall be included in
    * all copies or substantial portions of the Software.
    *
    * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
    * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
    * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
    * DEALINGS IN THE SOFTWARE.
    */
    #include <cstring>
    #include <iostream>
    #include "nvdsinfer_custom_impl.h"
    #define MIN(a,b) ((a) < (b) ? (a) : (b))
    #define MAX(a,b) ((a) > (b) ? (a) : (b))
    #define CLIP(a,min,max) (MAX(MIN(a, max), min))
    /* This is a sample bounding box parsing function for the sample SSD UFF
    * detector model provided with the TensorRT samples. */
    extern "C"
    bool NvDsInferParseCustomSSD (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
            NvDsInferNetworkInfo  const &networkInfo,
            NvDsInferParseDetectionParams const &detectionParams,
            std::vector<NvDsInferObjectDetectionInfo> &objectList);
    /* C-linkage to prevent name-mangling */
    extern "C"
    bool NvDsInferParseCustomSSD (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
            NvDsInferNetworkInfo  const &networkInfo,
            NvDsInferParseDetectionParams const &detectionParams,
            std::vector<NvDsInferObjectDetectionInfo> &objectList)
    {
      static int nmsLayerIndex = -1;
      static int nms1LayerIndex = -1;
      static bool classMismatchWarn = false;
      int numClassesToParse;
      static const int NUM_CLASSES_SSD = 9;
      if (nmsLayerIndex == -1) {
        for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
          if (strcmp(outputLayersInfo[i].layerName, "scores") == 0) {
            nmsLayerIndex = i;
            break;
          }
        }
        if (nmsLayerIndex == -1) {
        std::cerr << "Could not find scores layer buffer while parsing" << std::endl;
        return false;
        }
      }
      if (nms1LayerIndex == -1) {
        for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
          if (strcmp(outputLayersInfo[i].layerName, "boxes") == 0) {
            nms1LayerIndex = i;
            break;
          }
        }
        if (nms1LayerIndex == -1) {
        std::cerr << "Could not find boxes layer buffer while parsing" << std::endl;
        return false;
        }
      }
      if (!classMismatchWarn) {
        if (NUM_CLASSES_SSD !=
            detectionParams.numClassesConfigured) {
          std::cerr << "WARNING: Num classes mismatch. Configured:" <<
            detectionParams.numClassesConfigured << ", detected by network: " <<
            NUM_CLASSES_SSD << std::endl;
        }
        classMismatchWarn = true;
      }
      
      numClassesToParse = MIN (NUM_CLASSES_SSD,
          detectionParams.numClassesConfigured);
          
      int keepCount = *((int *) outputLayersInfo[nms1LayerIndex].buffer);
      float *detectionOut = (float *) outputLayersInfo[nmsLayerIndex].buffer;
      
      for (int i = 0; i < keepCount|| i<10; ++i)
      {
        //printf("paso1: ");
        float* det = detectionOut + i * 7;
        //printf("%f ",*det);
        int classId = det[1];
        if (classId >= numClassesToParse)
        {
          continue;
        }
        float threshold = detectionParams.perClassPreclusterThreshold[classId];
        if (det[2] < threshold)
        {
          continue;
        }
        printf("threshold: %f\n",threshold);
        unsigned int rectx1, recty1, rectx2, recty2;
        NvDsInferObjectDetectionInfo object;
        
        rectx1 = det[3] * networkInfo.width;
        recty1 = det[4] * networkInfo.height;
        rectx2 = det[5] * networkInfo.width;
        recty2 = det[6] * networkInfo.height;
        
        object.classId = classId;
        object.detectionConfidence = det[2];
        
        /* Clip object box co-ordinates to network resolution */
        object.left = CLIP(rectx1, 0, networkInfo.width - 1);
        object.top = CLIP(recty1, 0, networkInfo.height - 1);
        object.width = CLIP(rectx2, 0, networkInfo.width - 1) - object.left + 1;
        object.height = CLIP(recty2, 0, networkInfo.height - 1) - object.top + 1;
        printf("CLASS ID=%d\n",classId);
        printf("CONFIDENCE=%f\n",det[2]);
        printf("BBOX=(%f ,%f); %fx%f\n",object.top,object.left,object.width,object.height);
        printf("-------------\n");
        objectList.push_back(object);
      }
      return true;
    }
    /* Check that the custom function has been defined correctly */
    CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomSSD);

ai12 · April 7, 2021, 12:32am

@dusty_nv, sorry for tagging you here, but I was hoping you may know how to do this, since the nets were trained with your material.
Any help would be deeply appreciated.

dusty_nv · April 7, 2021, 1:09am

Hi @ai12, I haven’t used the pytorch-ssd models with DeepStream before, sorry about that. I’m not sure what code changes would be needed to support it. These pytorch-ssd models do not have NMS clustering inside them, that is manually implemented in the post-processing of jetson-inference.

If possible, you may want to look into training your detection model with TLT (Transfer Learning Toolkit) which is interoperable with DeepStream.

AastaLLL · April 7, 2021, 6:25am

Hi,

Do you mind to share the onnx file with us so we can check it directly?

Thanks.

ai12 · April 7, 2021, 2:21pm

Hi @AastaLLL. thanks you

I don’t mind at all. These are my files. labels.txt (63 Bytes) ssd-mobilenet.onnx (29.3 MB) .

This is the engine file generated.ssd-mobilenet.onnx_b1_gpu0_fp16.engine (15.1 MB)

And this is the last torch epoch of the training. mb1-ssd-Epoch-29-Loss-4.058936367864194.pth (29.3 MB)

By he way, the model is the same you would get by following dustynd’s steps on this guide. I executed everything as it is since the first great goal is to validate an example model on deepstream before training something on my own.

Thank you again.

jpberrios · April 8, 2021, 1:26am

Please also I want to know how to test some pytorch trained network on deepstream, perhaps if exist some tips or layer to add or adjust box or some some tutorial to take a look related to cnn, or rcnn

Best Regards

AastaLLL · April 19, 2021, 9:03am

Hi,

Thanks for your information and model.
We are checking this issue and will share more information with you later.

Thanks.

AastaLLL · April 26, 2021, 9:24am

Hi,

Sorry for keep you waiting. Here is some information of this issue.

The default parser expects to have the NMS result.
However, in the model trained by jetson-inference, the output is the raw confidence and bbox.

We are working on an example for jetson-inference based model.
Will let you know once it is ready.

Thanks.

jpberrios · April 26, 2021, 11:28am

Thank you & Waiting for your example.

AastaLLL · April 28, 2021, 7:33am

Hi,

Thanks for your patience.
Please check below for the patch of jetson-infernece model.
We mainly integrate the parser here into Deepstream objectDetector_SSD example:

diff --git a/nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp b/nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp
index b5e471d..ef4d7e8 100644
--- a/nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp
+++ b/nvdsinfer_custom_impl_ssd/nvdsparsebbox_ssd.cpp
@@ -29,9 +29,49 @@
 #define MAX(a,b) ((a) > (b) ? (a) : (b))
 #define CLIP(a,min,max) (MAX(MIN(a, max), min))
 
+bool intersects(NvDsInferObjectDetectionInfo &a, NvDsInferObjectDetectionInfo &b, float areaThreshold=0.0f)
+{
+  if( b.left>(a.left+a.width) || (b.left+b.width)<a.left || b.top>(a.top+a.height) || (b.top+b.height)<a.top ) // no overlap
+    return false;
+
+  float overlap = (MIN(a.left+a.width, b.left+b.width) - MAX(a.left, b.left)) * (MIN(a.top+a.height, b.top+b.height) - MAX(a.top, b.top));
+  float total   = MAX(a.width*a.height, b.width*b.height);
+  return (overlap/total) > areaThreshold;
+}
+
+bool clusterDetections(std::vector<NvDsInferObjectDetectionInfo> &objectList, NvDsInferObjectDetectionInfo &obj)
+{
+  if( objectList.size()==0 ) return true;
+
+  for( size_t m=0; m<objectList.size(); m++ )
+  {
+    if( intersects(objectList[m], obj) )
+    {
+      if( objectList[m].classId != obj.classId )
+      {
+        if( obj.detectionConfidence > objectList[m].detectionConfidence ) // replace
+        {
+          objectList.at(m) = obj;
+        }
+      }
+      else
+      {
+        float right  = MAX(objectList[m].left+objectList[m].width, obj.left+obj.width);
+        float bottom = MAX(objectList[m].top+objectList[m].height, obj.top+obj.height);
+	objectList[m].left = MIN(objectList[m].left, obj.left);
+	objectList[m].top  = MIN(objectList[m].top, obj.top);
+	objectList[m].width  = right - objectList[m].left;
+        objectList[m].height = bottom - objectList[m].top;
+        objectList[m].detectionConfidence = MAX(objectList[m].detectionConfidence, obj.detectionConfidence);
+      }
+      return false;
+    }
+  }
+  return true;
+}
+
 /* This is a sample bounding box parsing function for the sample SSD UFF
  * detector model provided with the TensorRT samples. */
-
 extern "C"
 bool NvDsInferParseCustomSSD (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
         NvDsInferNetworkInfo  const &networkInfo,
@@ -45,34 +85,34 @@ bool NvDsInferParseCustomSSD (std::vector<NvDsInferLayerInfo> const &outputLayer
         NvDsInferParseDetectionParams const &detectionParams,
         std::vector<NvDsInferObjectDetectionInfo> &objectList)
 {
-  static int nmsLayerIndex = -1;
-  static int nms1LayerIndex = -1;
+  static int confLayerIndex = -1;
+  static int bboxLayerIndex = -1;
   static bool classMismatchWarn = false;
-  int numClassesToParse;
-  static const int NUM_CLASSES_SSD = 91;
+// int numClassesToParse;
+  static const int NUM_CLASSES_SSD = 9;
 
-  if (nmsLayerIndex == -1) {
+  if (confLayerIndex == -1) {
     for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
-      if (strcmp(outputLayersInfo[i].layerName, "NMS") == 0) {
-        nmsLayerIndex = i;
+      if (strcmp(outputLayersInfo[i].layerName, "scores") == 0) {
+        confLayerIndex = i;
         break;
       }
     }
-    if (nmsLayerIndex == -1) {
-    std::cerr << "Could not find NMS layer buffer while parsing" << std::endl;
+    if (confLayerIndex == -1) {
+    std::cerr << "Could not find confidence layer buffer while parsing" << std::endl;
     return false;
     }
   }
 
-  if (nms1LayerIndex == -1) {
+  if (bboxLayerIndex == -1) {
     for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
-      if (strcmp(outputLayersInfo[i].layerName, "NMS_1") == 0) {
-        nms1LayerIndex = i;
+      if (strcmp(outputLayersInfo[i].layerName, "boxes") == 0) {
+        bboxLayerIndex = i;
         break;
       }
     }
-    if (nms1LayerIndex == -1) {
-    std::cerr << "Could not find NMS_1 layer buffer while parsing" << std::endl;
+    if (bboxLayerIndex == -1) {
+    std::cerr << "Could not find bounding box layer buffer while parsing" << std::endl;
     return false;
     }
   }
@@ -87,47 +127,54 @@ bool NvDsInferParseCustomSSD (std::vector<NvDsInferLayerInfo> const &outputLayer
     classMismatchWarn = true;
   }
 
-  numClassesToParse = MIN (NUM_CLASSES_SSD,
-      detectionParams.numClassesConfigured);
 
-  int keepCount = *((int *) outputLayersInfo[nms1LayerIndex].buffer);
-  float *detectionOut = (float *) outputLayersInfo[nmsLayerIndex].buffer;
+  float* conf = (float *) outputLayersInfo[confLayerIndex].buffer;
+  float* bbox = (float *) outputLayersInfo[bboxLayerIndex].buffer;
+  const uint32_t numBoxes = outputLayersInfo[bboxLayerIndex].dims.d[0];
+  const uint32_t numCoord = outputLayersInfo[bboxLayerIndex].dims.d[1];
 
-  for (int i = 0; i < keepCount; ++i)
+  for( uint32_t n=0; n < numBoxes; n++ )
   {
-    float* det = detectionOut + i * 7;
-    int classId = det[1];
-
-    if (classId >= numClassesToParse)
-      continue;
-
-    float threshold = detectionParams.perClassPreclusterThreshold[classId];
-
-    if (det[2] < threshold)
+    uint32_t maxClass = 0;
+    float    maxScore = -1000.0f;
+
+    // class #0 in ONNX-SSD is BACKGROUND (ignored)
+    for( uint32_t m=1; m < NUM_CLASSES_SSD; m++ )
+    {
+      const float score = conf[n * NUM_CLASSES_SSD + m];
+      if( score < detectionParams.perClassPreclusterThreshold[m] )
+        continue;
+
+      if( score > maxScore )
+      {
+        maxScore = score;
+        maxClass = m;
+      }
+    }
+    float threshold = detectionParams.perClassPreclusterThreshold[maxClass];
+    // check if there was a detection
+    if( maxClass <= 0 || maxScore < threshold )
       continue;
 
+    // populate a new detection entry
+    const float* coord = bbox + n * numCoord;
     unsigned int rectx1, recty1, rectx2, recty2;
     NvDsInferObjectDetectionInfo object;
 
-    rectx1 = det[3] * networkInfo.width;
-    recty1 = det[4] * networkInfo.height;
-    rectx2 = det[5] * networkInfo.width;
-    recty2 = det[6] * networkInfo.height;
-
-    object.classId = classId;
-    object.detectionConfidence = det[2];
-
-    /* Clip object box co-ordinates to network resolution */
-    object.left = CLIP(rectx1, 0, networkInfo.width - 1);
-    object.top = CLIP(recty1, 0, networkInfo.height - 1);
-    object.width = CLIP(rectx2, 0, networkInfo.width - 1) -
-      object.left + 1;
-    object.height = CLIP(recty2, 0, networkInfo.height - 1) -
-      object.top + 1;
-
-    objectList.push_back(object);
+    rectx1 = coord[0] * networkInfo.width;
+    recty1 = coord[1] * networkInfo.height;
+    rectx2 = coord[2] * networkInfo.width;
+    recty2 = coord[3] * networkInfo.height;
+
+    object.classId = maxClass-1;
+    object.detectionConfidence = maxScore;
+    object.left   = CLIP(rectx1, 0, networkInfo.width - 1);
+    object.top    = CLIP(recty1, 0, networkInfo.height- 1);
+    object.width  = CLIP(rectx2, 0, networkInfo.width - 1) - object.left+ 1;
+    object.height = CLIP(recty2, 0, networkInfo.height- 1) - object.top + 1;
+    if( clusterDetections(objectList, object) )
+      objectList.push_back(object);
   }
-
   return true;
 }

Thanks.

Topic		Replies	Views
Onnx model on deepstream5.0: Nvinfer error: Could not find NMS layer buffer while parsing DeepStream SDK ssd , pytorch , onnx	15	2587	October 12, 2021
Using custom model in deepstream DeepStream SDK jetson-inference , python , deepstream	40	568	September 10, 2024
How to use onnx file with deepstream-test1-usbcam + Custom models DeepStream SDK	30	4950	October 12, 2021
Port SSD_Mobilenet_V2.pb to be used in Deepstream 5.0? DeepStream SDK	3	635	October 12, 2021
Step-wise procedure to deploy a custom tensorflow 2.4 object detection model in deepstream 5.1 DeepStream SDK tensorrt , tensorflow , jetson-inference	18	2509	September 28, 2021
Tutorial: How to run YOLOv7 on Deepstream DeepStream SDK demos-and-tutorials	19	7253	March 26, 2024
Error in Deepstream Objectdetector-SSD while integrating Custom SSD model DeepStream SDK python , deep-learning	2	549	October 12, 2021
Custom detection ONNX model gives wrong outputs using nvinfer with DeepStream 5.1 DeepStream SDK	17	3046	October 12, 2021
how to deploy a new onnx model or trt model for detection in DeepStream ? DeepStream SDK	7	5901	October 12, 2021
Loading Yolov5 model into DeepStream DeepStream SDK	21	2443	August 3, 2023

How to run Nvidia's example torch SSD net on Deepstream-App with objectDetector_SSD's custom plugin

Related topics