Custom-trained SSD Inception model in TensorRT C++ version

Description

A custom-trained SSD Inception v2 model works well with the TensorRT Python version,
but it gives no detections in the C++ version (sampleUffSSD) of TensorRT.
What could be the problem?
Could somebody suggest a possible solution for this?

Environment

TensorRT Version: 6.0
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.14.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

UFF conversion script here.

configuration_uff.txt (3.4 KB)

TensorFlow .pb file here:
https://drive.google.com/file/d/1KwsmAJP9F4T7taZWlOGUWCZ8hk0fhZPD/view?usp=sharing

Hi @god_ra,
We are currently checking on this. Please allow us some time.
Thanks!

@AakankshaS,
Okay, I hope to get a solution for this from you soon.
To elaborate on the issue:
When I run inference, there are no errors or warnings; the detections simply do not happen.
But when I set the threshold to 0.05f instead of 0.5f, I do get some detections. That should not be the case, right…

0.05f is a 5% confidence threshold, whereas 0.5f is a 50% threshold.

But the same UFF model works well with the Python version at a 0.5f threshold (everything is normal in Python).

I think something is wrong with the detection score values in the C++ version,
but I am not able to pinpoint it exactly.

I hope you can find it and let us know a possible solution.
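For context, the threshold being discussed is just a cut on the confidence field of each detection row ([image_id, label, confidence, xmin, ymin, xmax, ymax], per the sample's own comment). A minimal stand-alone sketch of that filtering — the function name and data layout here are illustrative, not the sample's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each detection is a row of 7 floats:
// [image_id, label, confidence, xmin, ymin, xmax, ymax].
// Keep only the rows whose confidence (index 2) meets the threshold.
std::vector<const float*> filterDetections(const std::vector<float>& rows,
                                           float threshold)
{
    std::vector<const float*> kept;
    for (std::size_t i = 0; i + 7 <= rows.size(); i += 7)
    {
        if (rows[i + 2] >= threshold)
        {
            kept.push_back(&rows[i]);
        }
    }
    return kept;
}
```

With healthy scores (e.g. 0.9) a 0.5f threshold keeps detections; when every score comes out below 0.05, only an unrealistically low threshold lets anything through, which matches the symptom described here.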

Moving to Jetson Nano forum for resolution.

Hi,

We are going to reproduce this issue.
Will share more information with you later.

Thanks.

Hi,

We tried to access the model above, but the link is no longer available.
Could you re-upload it?

Thanks.

Hi @AastaLLL,
Please find the file here: frozen_inference_graph.pb

It has 91 classes.
I get no errors, but also no detections.

Let me know if you need any test image to check.

Hi,

We can reproduce this issue in our environment with the default sampleUffSSD example.
It seems that the output confidence is somehow extremely small (<0.05), so no valid bounding box is generated.
Usually, this kind of issue is related to the pre-processing of the image.
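As a reference for that pre-processing: SSD Inception v2 in this sample expects a 300x300, channel-planar (CHW) float input scaled to [-1, 1]. A stand-alone sketch of the pixel math, using plain arrays so it carries no OpenCV dependency — the function name and BGR-to-RGB ordering here follow the patch given later in this thread, but verify channel order against your own training pipeline:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert an interleaved (HWC) BGR uint8 image into the planar (CHW) RGB
// float layout the sample's input buffer expects, scaling each value from
// [0, 255] to [-1, 1] via (2/255)*v - 1.
std::vector<float> preprocess(const std::vector<std::uint8_t>& hwc, int h, int w)
{
    const int c = 3;
    std::vector<float> chw(static_cast<std::size_t>(c) * h * w);
    for (int y = 0; y < h; ++y)
    {
        for (int x = 0; x < w; ++x)
        {
            for (int ch = 0; ch < c; ++ch)
            {
                // Source channel ch (B,G,R) goes to destination plane 2-ch (R,G,B).
                const std::uint8_t v = hwc[(y * w + x) * c + ch];
                chw[(2 - ch) * h * w + y * w + x] = (2.0f / 255.0f) * v - 1.0f;
            }
        }
    }
    return chw;
}
```

Getting this scaling or layout wrong typically produces exactly the symptom above: the network runs without errors but every confidence collapses toward zero.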

Remember that you do get the correct output through the TensorRT Python interface.
Would you mind sharing the Python sample you used so we can find the difference?

Thanks.

I used this repo for Python inference.

Everything worked well there.

You can reproduce the results there without any problems.

Hi,

We fed the uff model generated from your .pb file into the Python sample but still got very small output values.
Based on this observation, the issue should come from the .pb → .uff conversion.

execute times 0.012755393981933594
0.01844021
0.014392335
0.012673735
0.010986943
0.0106114205
...

May I know the config you used in TRT_object_detection?
The default configuration does not work for your model, so we assume you have made some modifications.
If so, would you mind sharing the modified model_ssd_mobilenet_v2_coco_2018_03_29.py with us?

EDIT:
We still get very small values when using the configuration file above.
We are not sure which classes you trained for. Would you mind attaching a test image for us?

Thanks.

Hi,

Thanks for your test image.
We can now generate the correct bounding boxes with the TensorRT C++ API.

Please check the following for the detailed steps:

1. Generate uff model.

Please use this config.py.txt (2.5 KB) with the following command:

sudo python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py frozen_inference_graph.pb -o sample_ssd_relu6.uff -O NMS -p config.py
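For readers who cannot download the attachment: a config.py for convert_to_uff typically maps the TensorFlow SSD post-processing subgraph onto TensorRT plugin nodes via graphsurgeon. The sketch below follows the structure of the stock sampleUffSSD config; every parameter value (featureMapShapes, numClasses, node names) is illustrative and must match your trained graph — the attached config.py.txt above is the authoritative version:

```python
# Illustrative sketch only -- parameter values must match your model.
import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_plugin_node(name="Input", op="Placeholder",
                              dtype=tf.float32, shape=[1, 3, 300, 300])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
                                 numLayers=6, minSize=0.2, maxSize=0.95,
                                 aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
                                 variance=[0.1, 0.1, 0.2, 0.2],
                                 featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
                            shareLocation=1, varianceEncodedInTarget=0,
                            backgroundLabelId=0, confidenceThreshold=1e-8,
                            nmsThreshold=0.6, topK=100, keepTopK=100,
                            numClasses=91, inputOrder=[0, 2, 1],
                            confSigmoid=1, isNormalized=1)
concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2",
                                 dtype=tf.float32, axis=2)
concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT",
                                       dtype=tf.float32, axis=1, ignoreBatch=0)
concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT",
                                        dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "Concatenate": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf,
}

def preprocess(dynamic_graph):
    # Collapse the mapped TF namespaces into single TensorRT plugin nodes.
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Drop the original graph outputs so NMS becomes the output node.
    dynamic_graph.remove(dynamic_graph.graph_outputs,
                         remove_exclusive_dependencies=False)
```

A mismatch between this mapping and the trained graph (wrong inputOrder, wrong numClasses) is a common cause of silently tiny confidence values.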

2. Prepare data

Please store sample_ssd_relu6.uff in /usr/src/tensorrt/data/ssd.
Please also save your test image as test.jpeg in /usr/src/tensorrt/data/ssd.

3. Apply the following patch to our sampleUffSSD sample

diff --git a/Makefile.config b/Makefile.config
index a2a5ca8..dfd6ed9 100644
--- a/Makefile.config
+++ b/Makefile.config
@@ -141,7 +141,7 @@ ifneq ($(shell uname -m), $(TARGET))
   LIBPATHS += -L"../lib/stubs" -L"../../lib/stubs" -L"/usr/lib/$(DLSW_TRIPLE)/stubs" -L"/usr/lib/$(DLSW_TRIPLE)" -L"/usr/lib/$(CUBLAS_TRIPLE)/stubs" -L"/usr/lib/$(CUBLAS_TRIPLE)"
   LIBPATHS += -L"$(CUDA_INSTALL_DIR)/targets/$(CUDA_TRIPLE)/$(CUDA_LIBDIR)/stubs" -L"$(CUDA_INSTALL_DIR)/targets/$(CUDA_TRIPLE)/$(CUDA_LIBDIR)"
 endif
-INCPATHS += -I"../common" -I"$(CUDA_INSTALL_DIR)/include" -I"$(CUDNN_INSTALL_DIR)/include" -I"../include" -I"../../include" -I"../../parsers/onnxOpenSource"
+INCPATHS += -I"../common" -I"$(CUDA_INSTALL_DIR)/include" -I"$(CUDNN_INSTALL_DIR)/include" -I"../include" -I"../../include" -I"../../parsers/onnxOpenSource" -I"/usr/include/opencv4"
 LIBPATHS += -L"$(CUDA_INSTALL_DIR)/$(CUDA_LIBDIR)" -Wl,-rpath-link="$(CUDA_INSTALL_DIR)/$(CUDA_LIBDIR)"
 LIBPATHS += -L"$(CUDNN_INSTALL_DIR)/$(CUDNN_LIBDIR)" -Wl,-rpath-link="$(CUDNN_INSTALL_DIR)/$(CUDNN_LIBDIR)"
 LIBPATHS += -L"../lib" -L"../../lib" -L"$(TRT_LIB_DIR)" -Wl,-rpath-link="$(TRT_LIB_DIR)" $(STUBS_DIR)
@@ -223,12 +223,12 @@ ifeq ($(TARGET), qnx)
   COMMON_FLAGS += -D_POSIX_C_SOURCE=200112L -D_QNX_SOURCE -D_FILE_OFFSET_BITS=64 -fpermissive
 endif
 
-COMMON_LD_FLAGS += $(LIBPATHS) -L$(OUTDIR)
+COMMON_LD_FLAGS += $(LIBPATHS) -L$(OUTDIR) `pkg-config --libs opencv4`
 
 OBJDIR = $(call concat,$(OUTDIR),/chobj)
 DOBJDIR = $(call concat,$(OUTDIR),/dchobj)
 
-COMMON_LIBS += $(CUDART_LIB)
+COMMON_LIBS += $(CUDART_LIB) `pkg-config --cflags opencv4`
 ifneq ($(SAFE_PDK),1)
   COMMON_LIBS += $(CUBLAS_LIB) $(CUDNN_LIB)
 endif
diff --git a/sampleUffSSD/sampleUffSSD.cpp b/sampleUffSSD/sampleUffSSD.cpp
index 97dcead..8078849 100644
--- a/sampleUffSSD/sampleUffSSD.cpp
+++ b/sampleUffSSD/sampleUffSSD.cpp
@@ -29,6 +29,9 @@
 #include "common.h"
 #include "logger.h"
 
+#include "opencv2/highgui.hpp"
+#include "opencv2/imgproc.hpp"
+
 #include "NvInfer.h"
 #include "NvUffParser.h"
 #include <cuda_runtime_api.h>
@@ -39,6 +42,16 @@
 #include <sstream>
 
 const std::string gSampleName = "TensorRT.sample_uff_ssd";
+void readImage(const std::string& filename, cv::Mat &image)
+{
+    image = cv::imread(filename);
+    if (image.empty())
+    {
+        std::cout << "Cannot open image " << filename << std::endl;
+        exit(1);
+    }
+    cv::resize(image, image, cv::Size(300, 300));
+}
 
 //!
 //! \brief The SampleUffSSDParams structure groups the additional parameters required by
@@ -95,6 +108,7 @@ private:
 
     std::shared_ptr<nvinfer1::ICudaEngine> mEngine; //!< The TensorRT engine used to run the network
 
+    cv::Mat image;
     //!
     //! \brief Parses an UFF model for SSD and creates a TensorRT network
     //!
@@ -290,25 +304,26 @@ bool SampleUffSSD::processInput(const samplesCommon::BufferManager& buffers)
     const int batchSize = mParams.batchSize;
 
     // Available images
-    std::vector<std::string> imageList = {"dog.ppm", "bus.ppm"};
+    std::vector<std::string> imageList = {"test.jpeg"};
     mPPMs.resize(batchSize);
     assert(mPPMs.size() <= imageList.size());
     for (int i = 0; i < batchSize; ++i)
     {
-        readPPMFile(locateFile(imageList[i], mParams.dataDirs), mPPMs[i]);
+        readImage(locateFile(imageList[i], mParams.dataDirs), image);
     }
 
     float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer(mParams.inputTensorNames[0]));
     // Host memory for input buffer
-    for (int i = 0, volImg = inputC * inputH * inputW; i < mParams.batchSize; ++i)
+    for (int i = 0, volImg = inputH * inputW; i < mParams.batchSize; ++i)
     {
-        for (int c = 0; c < inputC; ++c)
+        for (unsigned j = 0, volChl = inputH * inputW; j < inputH; ++j)
         {
-            // The color image to input should be in BGR order
-            for (unsigned j = 0, volChl = inputH * inputW; j < volChl; ++j)
-            {
-                hostDataBuffer[i * volImg + c * volChl + j]
-                    = (2.0 / 255.0) * float(mPPMs[i].buffer[j * inputC + c]) - 1.0;
+            for (unsigned k = 0; k < inputW; ++k)
+            {
+                cv::Vec3b bgr = image.at<cv::Vec3b>(j,k);
+                hostDataBuffer[i * volImg + 0 * volChl + j * inputW + k] = (2.0 / 255.0) * float(bgr[2]) - 1.0;
+                hostDataBuffer[i * volImg + 1 * volChl + j * inputW + k] = (2.0 / 255.0) * float(bgr[1]) - 1.0;
+                hostDataBuffer[i * volImg + 2 * volChl + j * inputW + k] = (2.0 / 255.0) * float(bgr[0]) - 1.0;
             }
         }
     }
@@ -350,7 +365,7 @@ bool SampleUffSSD::verifyOutput(const samplesCommon::BufferManager& buffers)
     {
         int numDetections = 0;
         // at least one correct detection
-        bool correctDetection = false;
+        bool correctDetection = true;
 
         for (int i = 0; i < keepCount[p]; ++i)
         {
@@ -360,29 +375,27 @@ bool SampleUffSSD::verifyOutput(const samplesCommon::BufferManager& buffers)
                 continue;
             }
 
+            std::cout << det[2] << std::endl;
             // Output format for each detection is stored in the below order
             // [image_id, label, confidence, xmin, ymin, xmax, ymax]
             int detection = det[1];
             assert(detection < outputClsSize);
-            std::string storeName = classes[detection] + "-" + std::to_string(det[2]) + ".ppm";
+            std::string storeName = "class" + std::to_string(detection) + "-" + std::to_string(det[2]) + ".jpg";
 
             numDetections++;
-            if ((p == 0 && classes[detection] == "dog")
-                || (p == 1 && (classes[detection] == "truck" || classes[detection] == "car")))
-            {
-                correctDetection = true;
-            }
 
-            sample::gLogInfo << "Detected " << classes[detection].c_str() << " in the image " << int(det[0]) << " ("
-                     << mPPMs[p].fileName.c_str() << ")"
+            sample::gLogInfo << "Detected class" << std::to_string(detection) << " in the image " << int(det[0])
                      << " with confidence " << det[2] * 100.f << " and coordinates (" << det[3] * inputW << ","
                      << det[4] * inputH << ")"
                      << ",(" << det[5] * inputW << "," << det[6] * inputH << ")." << std::endl;
 
             sample::gLogInfo << "Result stored in " << storeName.c_str() << "." << std::endl;
 
-            samplesCommon::writePPMFileWithBBox(
-                storeName, mPPMs[p], {det[3] * inputW, det[4] * inputH, det[5] * inputW, det[6] * inputH});
+            cv::Mat out;
+            image.copyTo(out);
+            cv::rectangle(out, cv::Rect(det[3] * inputW, det[4] * inputH, det[5] * inputW - det[3] * inputW, det[6] * inputH - det[4] * inputH),
+                          cv::Scalar(rand() % 256, rand() % 256, rand() % 256), 2);
+            cv::imwrite(storeName, out);
         }
         pass &= correctDetection;
         pass &= numDetections >= 1;
@@ -413,7 +426,7 @@ SampleUffSSDParams initializeSampleParams(const samplesCommon::Args& args)
     params.uffFileName = "sample_ssd_mobilenet_v2.uff";
     params.labelsFileName = "ssd_coco_labels.txt";
     params.inputTensorNames.push_back("Input");
-    params.batchSize = 2;
+    params.batchSize = 1;
     params.outputTensorNames.push_back("NMS");
     params.outputTensorNames.push_back("NMS_1");
     params.dlaCore = args.useDLACore;

4. Testing

$ cd /usr/src/tensorrt/samples/
$ make
$ cd /usr/src/tensorrt/bin/
$ ./sample_uff_ssd

We get results similar to those generated by the TRT_object_detection Python sample.

0.999016
[08/17/2020-17:10:06] [I] Detected class18 in the image 0 with confidence 99.9016 and coordinates (54.1487,95.957),(83.7936,205.902).
[08/17/2020-17:10:06] [I] Result stored in class18-0.999016.jpg.
0.996855
[08/17/2020-17:10:06] [I] Detected class3 in the image 0 with confidence 99.6855 and coordinates (155.734,87.6828),(187.611,189.075).
[08/17/2020-17:10:06] [I] Result stored in class3-0.996855.jpg.
0.969117
[08/17/2020-17:10:06] [I] Detected class10 in the image 0 with confidence 96.9117 and coordinates (204.955,85.7581),(228.593,189.347).
[08/17/2020-17:10:06] [I] Result stored in class10-0.969117.jpg.
0.968491
[08/17/2020-17:10:06] [I] Detected class22 in the image 0 with confidence 96.8491 and coordinates (124.306,89.2235),(158.748,194.497).
[08/17/2020-17:10:06] [I] Result stored in class22-0.968491.jpg.
0.960414
[08/17/2020-17:10:06] [I] Detected class37 in the image 0 with confidence 96.0414 and coordinates (81.1726,96.9146),(105.868,207.71).
[08/17/2020-17:10:06] [I] Result stored in class37-0.960414.jpg.
0.901776
[08/17/2020-17:10:06] [I] Detected class2 in the image 0 with confidence 90.1776 and coordinates (184.289,84.7503),(209.169,188.614).
[08/17/2020-17:10:06] [I] Result stored in class2-0.901776.jpg.
0.901547
[08/17/2020-17:10:06] [I] Detected class15 in the image 0 with confidence 90.1547 and coordinates (105.564,95.422),(133.682,208.951).
[08/17/2020-17:10:06] [I] Result stored in class15-0.901547.jpg.
0.895801
[08/17/2020-17:10:06] [I] Detected class38 in the image 0 with confidence 89.5801 and coordinates (23.4149,77.7281),(56.8211,231.343).
[08/17/2020-17:10:06] [I] Result stored in class38-0.895801.jpg.
0.715208
[08/17/2020-17:10:06] [I] Detected class7 in the image 0 with confidence 71.5208 and coordinates (227.963,78.5744),(252.259,184.704).
[08/17/2020-17:10:06] [I] Result stored in class7-0.715208.jpg.
&&&& PASSED TensorRT.sample_uff_ssd # ./sample_uff_ssd

Here are some output results for your reference:

class2-0.901776 class3-0.996855 class7-0.715208 class10-0.969117 class15-0.901547 class18-0.999016 class22-0.968491 class37-0.960414 class38-0.895801

Thanks.

Hi @god_ra, did you successfully adapt the custom-trained SSD Inception model to TensorRT?

@AastaLLL,
Thank you.

Now I am able to run inference in TensorRT.

I have a question regarding the pre-processing step that you mentioned above:
is there any method we can use other than the three nested for-loops in the patch?

Let me know of any way to optimize it further.

Hi,

We suppose so, yes.

You can look for the OpenCV functions in C++ that correspond to these (NumPy-style) operations:

image = (2.0/255.0) * image - 1.0
image = image.transpose((2, 0, 1))

Thanks.

I have found this: Operations on Arrays — OpenCV 2.4.13.7 documentation

The function scaleAdd performs a multiplication and an addition:
dst(I) = scale * src1(I) + src2(I)

In our case:
scale is 2.0/255.0,
src1 is the image, and
src2 is a matrix filled with -1 values.

But how can I use it in this example to process the input image and store it in the data buffer?

Could you please give a suggestion or code snippet for this?

Hi,

You can create a cv::Mat to store the pre-processed output,
and pass its pointer to hostDataBuffer via the Mat's data member.
https://docs.opencv.org/4.1.0/d3/d63/classcv_1_1Mat.html#a4d33bed1c850265370d2af0ff02e1564
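Putting that suggestion together: with OpenCV you could do the scale-and-shift with cv::Mat::convertTo (dst = src*alpha + beta, so alpha = 2.0/255.0 and beta = -1.0), cv::split the result into three planar CV_32F Mats, and copy each plane's data into the host buffer. The sketch below shows the same contiguous-buffer idea with plain arrays so it stands alone; the OpenCV equivalents noted in the comments are an assumption to verify against your install:

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Fill a TensorRT host input buffer from three already-normalized planar
// channels with one memcpy per plane, instead of per-pixel loops.
// With OpenCV, each 'plane' would be a CV_32F cv::Mat produced by
// image.convertTo(dst, CV_32FC3, 2.0/255.0, -1.0) followed by cv::split(),
// and the source pointer would be reinterpret_cast<float*>(plane.data).
void fillHostBuffer(const std::vector<std::vector<float>>& planes,
                    float* hostDataBuffer)
{
    std::size_t offset = 0;
    for (const auto& plane : planes)
    {
        std::memcpy(hostDataBuffer + offset, plane.data(),
                    plane.size() * sizeof(float));
        offset += plane.size();
    }
}
```

This replaces the three nested loops with one vectorized conversion plus three bulk copies, which is the optimization being asked about above.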

Thanks.

Hey @god_ra,

Can you tell me the steps you followed and the changes that were required: the config file, the TF version, the model-zoo version used, etc.? Thanks!