How do I replace the model in objectDetector_SSD with my own SSD detection model?

I retrained my detection model using transfer learning. I don't think this changes the input and output layers of the UFF file, but when I replaced the UFF file in objectDetector_SSD with my own, I got the following error:

Using winsys: x11

Creating LL OSD context new

0:00:01.116309090  7904      0xba76cd0 WARN                 nvinfer gstnvinfer.cpp:515:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:useEngineFile(): Failed to read from model engine file

0:00:01.116429658  7904      0xba76cd0 INFO                 nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files

0:00:02.340588366  7904      0xba76cd0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): UffParser: Validator error: Cast: Unsupported operation _Cast

0:00:02.364366344  7904      0xba76cd0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:generateTRTModel(): Failed to parse UFF file: incorrect file or incorrect input/output blob names

0:00:02.367971775  7904      0xba76cd0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Failed to create engine from model files

0:00:02.368988445  7904      0xba76cd0 WARN                 nvinfer gstnvinfer.cpp:692:gst_nvinfer_start:<primary_gie_classifier> error: Failed to create NvDsInferContext instance

0:00:02.369046489  7904      0xba76cd0 WARN                 nvinfer gstnvinfer.cpp:692:gst_nvinfer_start:<primary_gie_classifier> error: Config file path: /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_SSD/config_infer_primary_ssd.txt, 
NvDsInfer Error: NVDSINFER_TENSORRT_ERROR
** ERROR: <main:651>: Failed to set pipeline to PAUSED

Quitting

ERROR from primary_gie_classifier: Failed to create NvDsInferContext instance

Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(692): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie_classifier:

Config file path: /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_SSD/config_infer_primary_ssd.txt, NvDsInfer Error: NVDSINFER_TENSORRT_ERROR
App run failed

Is there something wrong with the way I configured it? Is there any solution?
Here is the content of my config_infer_primary_ssd configuration file:

[property]
gpu-id=0
net-scale-factor=0.0078431372
offsets=127.5;127.5;127.5
model-color-format=0
model-engine-file=sample_ssd_relu6.uff_b1_fp32.engine
labelfile-path=ssd_coco_labels.txt
uff-file=sample_ssd_relu6.uff
uff-input-dims=3;300;300;0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=91
interval=0
gie-unique-id=1
is-classifier=0
output-blob-names=MarkOutput_0
parse-bbox-func-name=NvDsInferParseCustomSSD
custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so

[class-attrs-all]
threshold=0.5
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

## Per class configuration
#[class-attrs-2]
#threshold=0.6
#roi-top-offset=20
#roi-bottom-offset=10
#detected-min-w=40
#detected-min-h=40
#detected-max-w=400
#detected-max-h=800

I only changed the uff-file.

Hi,

In the newer TensorFlow API, the ToFloat operation has been replaced by Cast.
Here are two alternatives for your reference:

1. Use an older TensorFlow version.
Most of our users use v1.13.1, and the serialized .pb file works correctly with TensorRT.

2. Update the config.py file used when generating the uff file.
Replacing ToFloat with Cast should work; see the sketch below.
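
For example, if your conversion uses a preprocessing script with a namespace_plugin_map, as the TensorRT SSD sample does, the change would look roughly like this (a sketch only; the exact script depends on your conversion setup):

namespace_plugin_map = {
    # ...
    "Cast": Input,   # previously "ToFloat": Input; newer TF exports Cast nodes
    # ...
}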

Thanks.

Thank you, AastaLLL.
I used TensorFlow 1.14.0; I will switch to 1.13.1 for training and then test the .pb file in TensorRT.
Which config.py file needs to be changed when generating the uff file? Can you tell me?
Thank you so much, and best wishes.

Hi,
I retrained ssd_inception_v2_coco_2017_11_17 with transfer learning using TensorFlow 1.13.1, but the previous error still appears. What I want to know is how to verify that the .pb file works correctly with TensorRT. Can you tell me how to do that?
Thank you very much.

Hi,

Could you share your .pb file with us?
We want to check how to modify the config.py for your SSD model.

Thanks.

I'm sorry for replying after so long.
My .pb model is attached:
pb.zip (47.9 MB)

Hi,

I can convert your model into .uff without issue.
Please try the following command:

sudo python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py [INPUT] -o [OUTPUT] -O NMS -p /usr/src/tensorrt/samples/sampleUffSSD/config.py

For example:

sudo python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py frozen_inference_graph.pb -o output.uff -O NMS -p /usr/src/tensorrt/samples/sampleUffSSD/config.py
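
If you want to sanity-check the generated .uff outside of DeepStream, you can try building a TensorRT engine from it directly with the Python API. This is only a minimal sketch, assuming TensorRT 5.x/6.x, the built-in plugins registered via init_libnvinfer_plugins, and the blob names from the sample config (Input, MarkOutput_0); the file name output.uff is a placeholder:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
# Register the built-in plugins (GridAnchor_TRT, NMS_TRT, ...).
# If FlattenConcat_TRT is not registered in your TensorRT version,
# you may need to load the sample's plugin library separately.
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.UffParser() as parser:
    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 28
    # These must match uff-input-blob-name / output-blob-names in the config.
    parser.register_input("Input", (3, 300, 300))
    parser.register_output("MarkOutput_0")
    if not parser.parse("output.uff", network):
        raise RuntimeError("UFF parsing failed")
    engine = builder.build_cuda_engine(network)
    print("engine built" if engine else "engine build failed")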

Thanks.

Sorry for taking so long to reply.
I can also convert this .pb file into a .uff file, but the .uff file cannot be used in the DeepStream SDK. Can you try it with the objectDetector_SSD sample?

Hello, I'm sorry to bother you.
Have you tried the converted .uff file in DeepStream? I am eager to hear your answer. Thanks.

Hi,

Sorry for the late update.

We can reproduce this issue in our environment now.
To give further suggestions, could you tell us how many classes your customized model is trained for?

Thanks.

Thank you for your reply.
I used two classes during training: bus and jeep.
Also, I want to ask: is there any other way to add my own model to deepstream-SSD, or to directly generate a UFF file that deepstream-SSD can use?

Hi,

TensorRT uses the UFF format to support TensorFlow-based models.
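
(For reference, the same .pb-to-.uff conversion can also be driven from Python with the uff package that ships with TensorRT — a sketch using the same file names as the command above:)

import uff

# Equivalent to the convert_to_uff.py command shown earlier; the
# preprocessor argument points at the same config.py from sampleUffSSD.
uff.from_tensorflow_frozen_model(
    "frozen_inference_graph.pb",
    output_nodes=["NMS"],
    preprocessor="/usr/src/tensorrt/samples/sampleUffSSD/config.py",
    output_filename="output.uff")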

This error occurs when compiling the TensorRT engine with your customized uff file.
Let us track this issue further and get back to you asap.

Thanks.

Hi,

We have roughly figured out the cause of your problem.

Since the TensorFlow operations have been updated, there are some incompatibilities with TensorRT.
To fix the “Unsupported operation _Cast” issue, please update config.py with the following change:

diff --git a/config.py b/config.py
index 6be4bf0..794afbd 100644
--- a/config.py
+++ b/config.py
@@ -81,7 +81,7 @@ namespace_plugin_map = {
     "MultipleGridAnchorGenerator": PriorBox,
     "Postprocessor": NMS,
     "Preprocessor": Input,
-    "ToFloat": Input,
+    "Cast": Input,
     "image_tensor": Input,
     "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
     "MultipleGridAnchorGenerator/Identity": concat_priorbox,

However, there is one more error that looks like this:

[01/21/2020-15:57:57] [V] [TRT] UFFParser: Parsing GridAnchor[Op: GridAnchor_TRT].
[libprotobuf FATAL /externals/protobuf/aarch64/10.0/include/google/protobuf/repeated_field.h:1408] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of ‘google_private::protobuf::FatalException’
what(): CHECK failed: (index) < (current_size_):
Aborted (core dumped)

This error requires you to retrain the model with an updated multiple_grid_anchor_generator.py.

diff --git a/multiple_grid_anchor_generator.py b/multiple_grid_anchor_generator.py
index 86007c9..12da3bc 100644
--- a/multiple_grid_anchor_generator.py
+++ b/multiple_grid_anchor_generator.py
@@ -95,7 +95,8 @@ class MultipleGridAnchorGenerator(anchor_generator.AnchorGenerator):
       raise ValueError('box_specs_list is expected to be a '
                        'list of lists of pairs')
     if base_anchor_size is None:
-      base_anchor_size = [256, 256]
+      base_anchor_size = [256., 256.]
+    base_anchor_size = tf.constant(base_anchor_size, dtype=tf.float32)
     self._base_anchor_size = base_anchor_size
     self._anchor_strides = anchor_strides
     self._anchor_offsets = anchor_offsets
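
After retraining with the patched file, the frozen graph has to be re-exported before converting to .uff again. With the standard TensorFlow Object Detection API export script it would look something like this (all paths are placeholders):

python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/pipeline.config \
    --trained_checkpoint_prefix path/to/model.ckpt-NNNN \
    --output_directory path/to/exported_model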

You can also find more information in this comment:
https://devtalk.nvidia.com/default/topic/1069027/tensorrt/parsing-gridanchor-op-gridanchor_trt-protobuf-repeated_field-h-1408-check-failed-index-lt-current_size-/?offset=3#5415537
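
One more point worth checking, although this is an assumption based on the sampleUffSSD defaults rather than something verified against your files: since your model is trained for two classes, the class count should be consistent end-to-end. The NMS plugin node in config.py is created with numClasses=91 for COCO, and your config_infer_primary_ssd.txt sets num-detected-classes=91; for background + bus + jeep both would become 3, with a matching label file. A sketch of the config.py part (the exact fields depend on your TensorRT version):

NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=3,    # background + bus + jeep (COCO default is 91)
    inputOrder=[0, 2, 1],
    confSigmoid=1,
    isNormalized=1)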

Thanks.