Conversion of TF-TRT model to Deepstream errors

My pipeline’s inputs are multiple RTSP streams (from IP cameras) on which I would like to perform detection using a neural network followed by tracking and recognition.
My machine specs:

  • 2 RTX 2080TI cards with NVLINK connector between them.
  • Intel I7-9700K CPU.
  • 32 GB RAM.
  • Ubuntu 18.04LTS.
  • Nvidia driver 410.104.
  • CUDA 10.
  • cuDNN 7.5.
  • TensorRT 5.0.2.
  • DS 3.0.
  • I also installed Docker and pulled the Nvidia/tensorflow-19.02 and Nvidia/tensorrt-19.01 images.

I followed the GitHub guide:
I chose to download the ‘facessd’ model: facessd_mobilenet_v2_quantized_open_image_v4. I assume that since it comes from the TensorFlow models repository, it should be compatible with the existing TensorFlow/TensorRT framework without implementing new layers.
In order to convert the frozen ‘.pb’ TensorFlow model into an ‘.engine’ file, I executed the following code, both in the Nvidia/tensorflow-19.02 Docker container and in the regular installed environment:

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

frozen_graph_filename = 'frozen_graph_facessd_mobilenet_v2_quantized_open_image_v4_1_FP16.pb'

# First deserialize the frozen graph:
with tf.gfile.GFile(frozen_graph_filename, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# Convert to a TF-TRT optimized graph. The output node names below are
# the standard TF object-detection API outputs (adjust to your model):
trt_graph = trt.create_inference_graph(
    input_graph_def=graph_def,
    outputs=['num_detections', 'detection_boxes',
             'detection_scores', 'detection_classes'],
    max_batch_size=1,
    precision_mode='FP16')

# Serialize the result. Note this is still a TensorFlow GraphDef, not a
# standalone TensorRT engine:
with tf.gfile.GFile('facessd.engine', 'wb') as f:
    f.write(trt_graph.SerializeToString())

Following the engine creation I tried to infer with the deepstream-app using:

deepstream-app -c config.txt
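For context, the [primary-gie] section of config.txt pointed deepstream-app at the serialized engine roughly like this (a sketch; the paths and property values here are illustrative, not my exact config):

```
[primary-gie]
enable=1
# Path to the serialized engine created above (illustrative)
model-engine-file=../../models/facessd/frozen_graph_facessd_mobilenet_v2_quantized_open_image_v4_1_FP16_6.engine
batch-size=1
# The nvinfer element's own properties live in a separate config file
config-file=config_infer_primary_facessd.txt
```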

I got the following error, both in the tensorrt-19.01 Docker image and in the regular installed environment:

Using TRT model serialized engine <MY_PATH>/DeepStreamSDK-Tesla-v3.0/DeepStream_Release/samples/configs/deepstream-app/../../models/facessd/frozen_graph_facessd_mobilenet_v2_quantized_open_image_v4_1_FP16_6.engine crypto flags(0)
deepstream-app: engine.cpp:868: bool nvinfer1::rt::Engine::deserialize(const void*, std::size_t, nvinfer1::IGpuAllocator&, nvinfer1::IPluginFactory*): Assertion `size >= bsize && "Mismatch between allocated memory size and expected size of serialized engine."' failed.
Aborted (core dumped)

I also tried creating an engine file in another way using:

convert-to-uff frozen_graph_facessd_mobilenet_v2_quantized_open_image_v4_1_FP16_6.pb -o frozen_graph_facessd_mobilenet_v2_quantized_open_image_v4_1_FP16_6.uff

This yields the following error:

Converting BoxPredictor_5/ClassPredictor/act_quant/FakeQuantWithMinMaxVars as custom op: FakeQuantWithMinMaxVars
Traceback (most recent call last):
  File "/usr/local/bin/convert-to-uff", line 93, in <module>
  File "/usr/local/bin/convert-to-uff", line 89, in main
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 187, in from_tensorflow_frozen_model
    return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 157, in from_tensorflow
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 94, in convert_tf2uff_graph
    uff_graph, input_replacements, debug_mode=debug_mode)
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 79, in convert_tf2uff_node
    op, name, tf_node, inputs, uff_graph, tf_nodes=tf_nodes, debug_mode=debug_mode)
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 47, in convert_layer
    return cls.registry_[op](name, tf_node, inputs, uff_graph, **kwargs)
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 27, in convert_const
    array = tf2uff.convert_tf2numpy_const_node(tf_node)
  File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/", line 157, in convert_tf2numpy_const_node
    return array.reshape(shape)
ValueError: cannot reshape array of size 7681 into shape (3,3,128,12)
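For reference, the arithmetic behind that ValueError (plain Python; the shape and sizes are copied from the traceback): the converter tries to interpret a Const node from the quantized graph as a float conv kernel of shape (3, 3, 128, 12), but the node does not contain that many elements, so the reshape cannot succeed.

```python
# Element count the UFF converter expects for a float conv kernel of
# shape (3, 3, 128, 12), taken from the ValueError:
expected = 3 * 3 * 128 * 12
print(expected)            # 13824

# Element count actually present in the quantized Const node:
actual = 7681
print(actual == expected)  # False -> array.reshape(shape) raises
```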

Please advise: what am I doing wrong here?



There is some misunderstanding here.

DeepStream only supports pure TensorRT models, not TF-TRT models.
From your error log:

deepstream-app: engine.cpp:868: bool nvinfer1::rt::Engine::deserialize(const void*, std::size_t, nvinfer1::IGpuAllocator&, nvinfer1::IPluginFactory*): Assertion `size >= bsize && "Mismatch between allocated memory size and expected size of serialized engine."' failed.

It looks like some layers are not supported by TensorRT and fall back to the TensorFlow implementation.
The error indicates that this TensorFlow fallback is not available in the DeepStream SDK.

To fix this issue, you will need to add the unsupported layers as plugin layers.
Here is a tutorial for your reference:
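For TensorFlow object-detection models, the usual route is a graphsurgeon preprocessing script passed to convert-to-uff via -p, which collapses the unsupported subgraphs into TensorRT plugin nodes. A minimal sketch, assuming the SSD plugins that ship with TensorRT's UFF SSD sample (the namespace names and parameters below are illustrative, not the exact ones for facessd):

```python
import graphsurgeon as gs

# Plugin nodes that will replace unsupported TF subgraphs.
# "GridAnchor_TRT" and "NMS_TRT" come from TensorRT's SSD sample plugins;
# the parameter values here are placeholders.
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
                                 numLayers=6, minSize=0.2, maxSize=0.95)
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
                            topK=100, confidenceThreshold=1e-8)

# Map TF namespaces to the plugin nodes that replace them.
namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
}

def preprocess(dynamic_graph):
    # convert-to-uff calls this hook when invoked with -p config.py
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
```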



I tried the GitHub repo you recommended. I was able to create an engine file for YOLOv3 and tried to run the example from “deepstream_reference_apps/yolo/samples/objectDetector_YoloV3/” using:

deepstream-app -c deepstream_app_config_yoloV3.txt

I am getting some weird behavior that looks like this:
Do you have any idea why this happens?

Thank you!


You can get some information on this topic:

And here is a change to enable the full-frame YOLO in the deepstream-app.


I followed:
I changed all files according to the two patches (offered in the other thread):

Now the video runs without any detection. The above error persists until I change [primary-gie] in the config to [yoloplugin], and then it does not detect at all.


I managed to find a solution (thanks to Nvidia’s support) using the following steps:

  • I created the engine using the trt-yolo-app, after setting width=416 and height=416 in the config file (yolov3.cfg).
  • I used the deepstream_reference_apps/yolo/samples/objectDetector_YoloV3/deepstream_app_config_yoloV3.txt config file.
  • Then I applied the above patch for deepstream-app, and it works with the [primary-gie] settings in the config.
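The resolution change from the first step amounts to editing two lines of yolov3.cfg before running trt-yolo-app (shown for reference; the rest of the file is unchanged):

```
# yolov3.cfg -- network input resolution used when building the engine
width=416
height=416
```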

Please suggest steps:
I have trained a fire-detection .pb model and I want to run this .pb model using DeepStream 5.0. How is this possible?

Hi shankarjadhav232,

Please help to open a new topic for your issue. Thanks.