How to write config.py for converting SSD-MobileNetV2 to UFF format

I have read the topic:
https://devtalk.nvidia.com/default/topic/1049802/jetson-nano/object-detection-with-mobilenet-ssd-slower-than-mentioned-speed/post/5327974/#5327974

In the sampleUffSSD_rect sample, there is a sample_unpruned_mobilenet_v2.uff. It worked well.
But when I use the command:

convert-to-uff --input-file frozen_inference_graph.pb -O NMS -p config.py

the new UFF does not work when I run ./sampleUffSSD_rect. The error is:

../data/ssd/sample_unpruned_mobilenet_v2.uff
Registering UFF model
Registered Input
Registered output NMS
Creating engine
Begin parsing model...
ERROR: Parameter check failed at: ../builder/Layers.h::setAxis::315, condition: axis>=0
End parsing model...
Begin building engine...
ERROR: Concatenate/concat: all concat input tensors must have the same dimensions except on the concatenation axis
ERROR: Could not compute dimensions for Concatenate/concat, because the network is not valid
Time lapsed to create an engine: 26.1773ms
INTERNAL_ERROR: sample_uff_ssd: Unable to create engine
sample_uff_ssd_rect: sampleUffSSD.cpp:585: int main(int, char**): Assertion `tmpEngine != nullptr' failed.

And I read the README.txt of the sampleUffSSD_rect sample, which says:

Steps to generate UFF file:
    0. Make sure you have the UFF converter installed. For installation instructions, see:
        https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/#python and click on the 'TensorRT Python API' link.

    1. Get the pre-trained Tensorflow model (ssd_inception_v2_coco)
 from:
        http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz

    2. Call the UFF converter with the preprocessing flag set (-p [config_file]).
        The config.py script specifies the preprocessing operations necessary for SSD TF graph.
        It must be copied to the working directory for the file to be imported properly.
        The plugin nodes and plugin parameters used in config.py should match the registered plugins
        in TensorRT. Please read the plugins documentation for more details.

        'convert-to-uff --input-file frozen_inference_graph.pb -O NMS -p config.py'

I think the config.py of the sampleUffSSD_rect sample is not for ssd-mobilenetv2; it is for ssd_inception_v2_coco.

Can you tell me how to write config.py for ssd-mobilenetv2?

Hi,

This is a known issue in the UFF parser and will be fixed in our next release.

Currently, you can hack the TF graph and change the batch dimension to anything other than -1,
and combine that with the attached config file to bypass the error.

Thanks.
config.py.zip (941 Bytes)

Hello,

After applying the above config operations, I am running into the following error:

python3: nmsPlugin.cpp:135: virtual void nvinfer1::plugin::DetectionOutput::configureWithFormat(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, nvinfer1::DataType, nvinfer1::PluginFormat, int): Assertion `numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0]' failed.
Aborted (core dumped)

I understand this is due to an incorrect number of classes being specified. I have tried multiple values of numClasses in the NMS_TRT plugin options, but all of them return this error. Another post where this error was mentioned is https://devtalk.nvidia.com/default/topic/1047429/tensorrt/sampleuffssd-with-custom-ssd_mobilenet_v1-model/.

Could you please take a look at this?

Hi,

I was able to carry out the conversion and get it to work successfully, with some minor changes, using the config.py that was shared in the prior post.

I had to change the values passed to inputOrder when creating the NMS plugin node. In the original script, the inputOrder was [0, 2, 1]; I had to change it to [1, 0, 2]. The inputOrder specifies the order of the inputs {loc_data, conf_data, priorbox_data}. You can work out the right order by passing the -t option to convert-to-uff, which also writes a pbtxt file, and inspecting the order of the inputs to the NMS node.

For reference, this is how my NMS plugin node was created:

NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=91,
    inputOrder=[1, 0, 2], #This has to correspond to the right order.
    confSigmoid=1,
    isNormalized=1)

Hope this helps.

hi,
thank you for your replies.

@AastaLLL and @avishek.alex15
Could you be so kind as to tell me HOW TO “hack the TF graph and change the batch dimension”?

@keewei.lam
Could you be so kind as to tell me HOW TO generate and read the text (pbtxt) file produced by the -t option of convert-to-uff?

@firefox1200

For my case, I had access to the weights of the network. While exporting the frozen graph, I passed a fixed batch size of 1. That solves the “box_predictor/reshape using -1 more than once” problem.
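The batch-dimension hack mentioned above can be sketched in a few lines of Python. This is only an illustration: `fix_batch_dim` is a hypothetical helper, not part of any NVIDIA tool. In a real graph you would apply the same substitution to the shape attribute of the `image_tensor` Placeholder node (for example after loading frozen_inference_graph.pb into a graphsurgeon DynamicGraph), or simply re-export the frozen graph with a fixed batch size as described above.

```python
# Illustrative sketch only: pin a dynamic (-1) batch dimension to a fixed
# size. In a real frozen graph you would rewrite the Placeholder's shape
# attribute rather than a plain Python list.
def fix_batch_dim(shape, batch_size=1):
    """Return a copy of `shape` with a -1 batch dimension replaced by batch_size."""
    fixed = list(shape)
    if fixed and fixed[0] == -1:
        fixed[0] = batch_size
    return fixed

# A TF-style NHWC input shape with a dynamic batch dimension:
print(fix_batch_dim([-1, 300, 300, 3]))  # -> [1, 300, 300, 3]
```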

@keewei.lam
Could you please explain how to determine the inputOrder, and which list it should correspond to? Even after changing the input order, I am getting the same error during conversion.

My outputs look like this:

Outputs:  [name: "NMS"
op: "NMS_TRT"
input: "concat_priorbox"
input: "Squeeze"
input: "concat_box_conf"
attr {
  key: "backgroundLabelId_u_int"
  value {
    i: 0
  }
}
attr {
  key: "confSigmoid_u_int"
  value {
    i: 1
  }
}
attr {
  key: "confidenceThreshold_u_float"
  value {
    f: 1e-08
  }
}
attr {
  key: "inputOrder_u_ilist"
  value {
    list {
      i: 1
      i: 0
      i: 2
    }
  }
}
attr {
  key: "isNormalized_u_int"
  value {
    i: 1
  }
}
attr {
  key: "keepTopK_u_int"
  value {
    i: 100
  }
}
attr {
  key: "nmsThreshold_u_float"
  value {
    f: 0.6
  }
}
attr {
  key: "numClasses_u_int"
  value {
    i: 4
  }
}
attr {
  key: "scoreConverter_u_str"
  value {
    s: "SIGMOID"
  }
}
attr {
  key: "shareLocation_u_int"
  value {
    i: 1
  }
}
attr {
  key: "topK_u_int"
  value {
    i: 100
  }
}
attr {
  key: "varianceEncodedInTarget_u_int"
  value {
    i: 0
  }
}
]

Thanks.

@avishek.alex15

The three input nodes to your NMS_TRT node are as follows:

{
   ...
   input: "concat_priorbox"
   input: "Squeeze"
   input: "concat_box_conf"
}

Hence, the inputOrder is [1, 2, 0], following the order {loc_data, conf_data, priorbox_data}.
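The rule above can be captured in a small helper. This is a sketch, not part of any NVIDIA tool: `derive_input_order` is a hypothetical function that guesses each input's role from the node names used in this thread and returns the indices in {loc_data, conf_data, priorbox_data} order.

```python
# Sketch: derive the NMS_TRT inputOrder from the order in which the three
# inputs appear on the NMS node in the converted pbtxt. Roles are guessed
# from node names as they appear in this thread; adjust for your own graph.
def derive_input_order(nms_inputs):
    """Return inputOrder = indices of [loc_data, conf_data, priorbox_data]."""
    def role(name):
        lowered = name.lower()
        if "priorbox" in lowered:
            return "priorbox_data"
        if "conf" in lowered:
            return "conf_data"
        # Anything else ("Squeeze", "concat_box_loc", ...) is the box
        # location tensor in the SSD graphs discussed here.
        return "loc_data"

    roles = [role(name) for name in nms_inputs]
    return [roles.index(r) for r in ("loc_data", "conf_data", "priorbox_data")]

# Inputs exactly as listed in the pbtxt above:
print(derive_input_order(["concat_priorbox", "Squeeze", "concat_box_conf"]))
# -> [1, 2, 0]
```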

@keewei.lam

Thank you so much for explaining it. I have successfully converted my model following your instructions. :-)

hi,
Thank you very much, @avishek.alex15 @keewei.lam.
I will try it!

Hi, Have you solved this problem?

Hi, I have made it work. You are right!

import os
import sys
import tarfile

import requests
import tensorflow as tf
import tensorrt as trt
import graphsurgeon as gs
import uff

def ssd_unsupported_nodes_to_plugin_nodes(ssd_graph):

    # Find and remove all Assert Tensorflow nodes from the graph
    all_assert_nodes = ssd_graph.find_nodes_by_op("Assert")
    ssd_graph.remove(all_assert_nodes, remove_exclusive_dependencies=True)
    # Optionally, find Identity nodes and forward their inputs (not needed here)
    #all_identity_nodes = ssd_graph.find_nodes_by_op("Identity[0-5]")
    #ssd_graph.forward_inputs(all_identity_nodes)

    # Create TRT plugin nodes to replace unsupported ops in Tensorflow graph
    channels = 3
    height = 300
    width = 300

    Input = gs.create_plugin_node(name="Input",
        op="Placeholder",
        dtype=tf.float32,
        shape=[1, channels, height, width])
    PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
        minSize=0.2,
        maxSize=0.95,
        aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
        variance=[0.1,0.1,0.2,0.2],
        featureMapShapes=[19, 10, 5, 3, 2, 1], 
        numLayers=6
    )
    NMS = gs.create_plugin_node(
        name="NMS",
        op="NMS_TRT",
        shareLocation=1,
        varianceEncodedInTarget=0,
        backgroundLabelId=0,
        confidenceThreshold=1e-8,
        nmsThreshold=0.6,
        topK=100,
        keepTopK=100,
        numClasses=91,
        inputOrder=[1, 0, 2],
        confSigmoid=1,
        isNormalized=1,
        scoreConverter="SIGMOID"
    )
    concat_priorbox = gs.create_node(
        "concat_priorbox",
        op="ConcatV2",
        dtype=tf.float32,
        axis=2
    )
    concat_box_loc = gs.create_plugin_node(
        "concat_box_loc",
        op="FlattenConcat_TRT",
        dtype=tf.float32,
        axis=1,
        ignoreBatch=0
    )
    concat_box_conf = gs.create_plugin_node(
        "concat_box_conf",
        op="FlattenConcat_TRT",
        dtype=tf.float32,
        axis=1,
        ignoreBatch=0
    )
    #NMS.input.extend([tensor.op.name for tensor in [concat_box_loc,concat_box_conf,concat_priorbox]])

    # Create a mapping of namespace names -> plugin nodes.
    namespace_plugin_map = {
        "Postprocessor": NMS,
        "Preprocessor": Input,
        "ToFloat": Input,
        "image_tensor": Input,
        "MultipleGridAnchorGenerator": PriorBox,
        "Concatenate/concat": concat_priorbox,
        "Squeeze": concat_box_loc,
        "concat_1": concat_box_conf
    }

    # Create a new graph by collapsing namespaces
    ssd_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    # If remove_exclusive_dependencies is True, the whole graph will be removed!
    ssd_graph.remove(ssd_graph.graph_outputs, remove_exclusive_dependencies=False)
    # Disconnect the Input node from NMS, as it expects to have only 3 inputs
    ssd_graph.find_nodes_by_op('NMS_TRT')[0].input.remove('Input')

    return ssd_graph

def model_to_uff(model_path, output_uff_path, silent=False):

    dynamic_graph = gs.DynamicGraph(model_path)
    dynamic_graph = ssd_unsupported_nodes_to_plugin_nodes(dynamic_graph)
    
    uff.from_tensorflow(
        dynamic_graph.as_graph_def(),
        ['NMS'],
        output_filename=output_uff_path,
        text=False,
        quiet=False
    )

def main():
    model_path = './ssd_frozen_inference_graph.pb'
    uff_path = './ssd_frozen_inference_graph.uff'
    model_to_uff(model_path,uff_path)

if __name__ == '__main__':
    main()

Thanks for posting your advice everyone! I can convert to UFF successfully but then creating an engine fails with the same NMS assertion fail.

sample_uff_ssd: nmsPlugin.cpp:135: virtual void nvinfer1::plugin::DetectionOutput::configureWithFormat(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, nvinfer1::DataType, nvinfer1::PluginFormat, int): Assertion `numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0]' failed.
Aborted (core dumped)

My steps:
Download the frozen ssd_mobilenet_v2_coco_2018_03_29 graph and convert it to UFF using the modified config file.
Then create an engine using the TensorRT samples. I've tried both the sampleUffSSD program and the sampleUffSSD_rect program. My only modifications to those are changing the UFF file, setting the number of class labels to 91, and setting the inputOrder to [1, 2, 0]. Is there something I missed?


Update: I just forgot to update the class labels in the config. I am now achieving 6 ms (roughly 150 fps) inference time on Jetson Xavier! Perhaps even more throughput could be achieved with a larger batch size.

hey everyone,

thanks for your tips so far.
I can convert my transfer-learned ssd_mobilenet_v2 coco_2018_03_29 frozen graph with one class to UFF with this config (pretty much the one from this thread):

python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py --input-file /usr/src/tensorrt/samples/python/uff_ssd/workspace/models/SSD_T/frozen_inference_graph.pb -O NMS -p /usr/src/tensorrt/samples/python/uff_ssd/config.py -t

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1,0.1,0.2,0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=2,
    inputOrder=[1, 0, 2], # default is [0, 2, 1]; check the pbtxt for the right order
    confSigmoid=1,
    isNormalized=1,
    scoreConverter="SIGMOID")
concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2", dtype=tf.float32, axis=2)

concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "Concatenate": concat_priorbox,
    "Squeeze": concat_box_loc, #"concat": concat_box_loc,
    "concat_1": concat_box_conf
}

def preprocess(dynamic_graph):
    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)
    # Disconnect the Input node from NMS, as it expects to have only 3 inputs.
    dynamic_graph.find_nodes_by_op("NMS_TRT")[0].input.remove("Input")

I changed the number of classes to 2 and the input order.
I think [1, 0, 2] is the right inputOrder, since my .pbtxt says:

id: "NMS"
    inputs: "concat_box_conf"
    inputs: "concat_box_loc"
    inputs: "concat_priorbox"

When I try to run it with
/usr/src/tensorrt/samples/python/uff_ssd$ sudo python3 detect_objects.py
I get this error:

python3: nmsPlugin.cpp:136: virtual void nvinfer1::plugin::DetectionOutput::configureWithFormat(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, nvinfer1::DataType, nvinfer1::PluginFormat, int): Assertion `numPriors * param.numClasses == inputDims[param.inputOrder[1]].d[0]' failed.
Aborted

Any clue what I am doing wrong?
I've read the part about hacking your frozen graph to change the batch size from -1 to something else, but I didn't really understand how to do it.

I would be grateful if anyone could help me out here :)

Track the following issue on this topic:
https://devtalk.nvidia.com/default/topic/1051455/jetson-nano/problems-with-ssd-mobilenet-v2-uff/

Thanks.

Thanks keewei.lam,

Most people have trouble when trying to run a custom TensorFlow model in TensorRT, and I think this little piece of information will solve most of those problems :)

@atyshka can you share your modified config?

Hello, I trained a custom mobilenetv2_fn model with 6 classes on TensorFlow 1.15.
I then converted it to a .pb file.
But when building the TensorRT engine, I get the following error:

Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS_TRT yet.
Converting NMS as custom op: NMS_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_conf as custom op: FlattenConcat_TRT
Warning: No conversion function registered for layer: Unpack yet.
Converting Preprocessor/unstack as custom op: Unpack
Warning: No conversion function registered for layer: GridAnchor_TRT yet.
Converting GridAnchor as custom op: GridAnchor_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_loc as custom op: FlattenConcat_TRT
DEBUG [/usr/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['NMS'] as outputs
No. nodes: 612
UFF Output written to ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph_1.uff
[TensorRT] ERROR: UffParser: Validator error: Preprocessor/unstack: Unsupported operation _Unpack
Building TensorRT engine, this may take a few minutes…
[TensorRT] ERROR: Network must have at least one output
Traceback (most recent call last):
File "main.py", line 371, in
buf = trt_engine.serialize()
AttributeError: 'NoneType' object has no attribute 'serialize'

Please help me fix it.
I attach the .pb file and the Python file I used.
I hope for good news soon.

Hi andrey201917,

Please open a new topic for your issue. Thanks.