DeepLab not working on DeepStream 5.0.1

• Hardware Platform (Jetson / GPU)
Jetson TX2
• DeepStream Version
5.0.1 (due to the client's specific needs)
• JetPack Version (valid for Jetson only)
4.4
• TensorRT Version
7.1.3
• Issue Type (questions)
Hi Everyone,

When I run the sample (/opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-segmentation-test/) with my own engine file, no error seems to occur, but the output only shows a black screen.

It would be great if I could get some feedback!

• How to reproduce the issue? (This is for bugs. Include which sample app is used, the contents of the configuration files, the command line used, and other details for reproducing.)

  1. Download the pretrained DeepLab model (deeplabv3_mnv2_pascal_trainval_2018_01_29) from the TensorFlow model zoo

  2. Re-export frozen_inference_graph using the checkpoint included in the downloaded file so that the input dimensions match what DeepStream expects, following this TOPIC

    python /content/drive/MyDrive/tf1_retrain/research/deeplab/export_model_revised.py \
                            --checkpoint_path=/content/drive/MyDrive/tf1_retrain/deeplab_checkpoints/pascal_mbv2/model.ckpt-30000 \
                            --export_path=/content/drive/MyDrive/tf1_retrain/deeplab_checkpoints/frozen_pascal_mbv2.pb \
                            --model_variant="mobilenet_v2" \
                            --num_classes=21
    

    The export file I used to adjust the input dimensions (modified just like in this TOPIC):

    # Lint as: python2, python3
    # Copyright 2018 The TensorFlow Authors All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    # ==============================================================================
    """Exports trained model to TensorFlow frozen graph."""
    
    import os
    import tensorflow as tf
    
    from tensorflow.contrib import quantize as contrib_quantize
    from tensorflow.python.tools import freeze_graph
    from deeplab import common
    from deeplab import input_preprocess
    from deeplab import model
    
    slim = tf.contrib.slim
    flags = tf.app.flags
    
    FLAGS = flags.FLAGS
    
    flags.DEFINE_string('checkpoint_path', None, 'Checkpoint path')
    
    flags.DEFINE_string('export_path', None,
                        'Path to output Tensorflow frozen graph.')
    
    flags.DEFINE_integer('num_classes', 21, 'Number of classes.')
    
    flags.DEFINE_multi_integer('crop_size', [512, 512],
                               'Crop size [height, width].')
    
    # For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or
    # rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note
    # one could use different atrous_rates/output_stride during training/evaluation.
    flags.DEFINE_multi_integer('atrous_rates', None,
                               'Atrous rates for atrous spatial pyramid pooling.')
    
    flags.DEFINE_integer('output_stride', 8,
                         'The ratio of input to output spatial resolution.')
    
    # Change to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale inference.
    flags.DEFINE_multi_float('inference_scales', [1.0],
                             'The scales to resize images for inference.')
    
    flags.DEFINE_bool('add_flipped_images', False,
                      'Add flipped images during inference or not.')
    
    flags.DEFINE_integer(
        'quantize_delay_step', -1,
        'Steps to start quantized training. If < 0, will not quantize model.')
    
    flags.DEFINE_bool('save_inference_graph', False,
                      'Save inference graph in text proto.')
    
    # Input name of the exported model.
    _INPUT_NAME = 'ImageTensor'
    
    # Output name of the exported predictions.
    _OUTPUT_NAME = 'SemanticPredictions'
    _RAW_OUTPUT_NAME = 'RawSemanticPredictions'
    
    # Output name of the exported probabilities.
    _OUTPUT_PROB_NAME = 'SemanticProbabilities'
    _RAW_OUTPUT_PROB_NAME = 'RawSemanticProbabilities'
    
    
    def _create_input_tensors():
        """Creates and prepares input tensors for DeepLab model.
    
        This method creates a 4-D float32 image tensor 'ImageTensor' with shape
        [1, 3, 512, 512] (NCHW), which is transposed to NHWC before preprocessing.
        The actual input tensor name to use during inference is 'ImageTensor:0'.
    
        Returns:
          image: Preprocessed 4-D float32 tensor with shape [1, crop_height,
            crop_width, 3].
          original_image_size: Original image shape tensor [height, width].
          resized_image_size: Resized image shape tensor [height, width].
        """
        # input_preprocess takes 4-D image tensor as input.
        input_image = tf.placeholder(
            tf.float32, [1, 3, 512, 512], name=_INPUT_NAME)
        original_image_size = tf.shape(input_image)[2:4]
        input_image = tf.transpose(input_image, (0, 2, 3, 1))
    
        # Squeeze the dimension in axis=0 since `preprocess_image_and_label` assumes
        # image to be 3-D.
        image = tf.squeeze(input_image, axis=0)
        resized_image, image, _ = input_preprocess.preprocess_image_and_label(
            image,
            label=None,
            crop_height=FLAGS.crop_size[0],
            crop_width=FLAGS.crop_size[1],
            min_resize_value=FLAGS.min_resize_value,
            max_resize_value=FLAGS.max_resize_value,
            resize_factor=FLAGS.resize_factor,
            is_training=False,
            model_variant=FLAGS.model_variant)
        resized_image_size = tf.shape(resized_image)[:2]
    
        # Expand the dimension in axis=0, since the following operations assume the
        # image to be 4-D.
        image = tf.expand_dims(image, 0)
    
        return image, original_image_size, resized_image_size
    
    
    def main(unused_argv):
        tf.logging.set_verbosity(tf.logging.INFO)
        tf.logging.info('Prepare to export model to: %s', FLAGS.export_path)
    
        with tf.Graph().as_default():
            image, image_size, resized_image_size = _create_input_tensors()
    
            model_options = common.ModelOptions(
                outputs_to_num_classes={common.OUTPUT_TYPE: FLAGS.num_classes},
                crop_size=FLAGS.crop_size,
                atrous_rates=FLAGS.atrous_rates,
                output_stride=FLAGS.output_stride)
    
            if tuple(FLAGS.inference_scales) == (1.0,):
                tf.logging.info('Exported model performs single-scale inference.')
                predictions = model.predict_labels(
                    image,
                    model_options=model_options,
                    image_pyramid=FLAGS.image_pyramid)
            else:
                tf.logging.info('Exported model performs multi-scale inference.')
                if FLAGS.quantize_delay_step >= 0:
                    raise ValueError(
                        'Quantize mode is not supported with multi-scale test.')
                predictions = model.predict_labels_multi_scale(
                    image,
                    model_options=model_options,
                    eval_scales=FLAGS.inference_scales,
                    add_flipped_images=FLAGS.add_flipped_images)
            raw_predictions = tf.identity(
                tf.cast(predictions[common.OUTPUT_TYPE], tf.float32),
                _RAW_OUTPUT_NAME)
            raw_probabilities = tf.identity(
                predictions[common.OUTPUT_TYPE + model.PROB_SUFFIX],
                _RAW_OUTPUT_PROB_NAME)
    
            # Crop the valid regions from the predictions.
            semantic_predictions = raw_predictions[
                :, :resized_image_size[0], :resized_image_size[1]]
            semantic_probabilities = raw_probabilities[
                :, :resized_image_size[0], :resized_image_size[1]]
    
            # Resize back the prediction to the original image size.
            def _resize_label(label, label_size):
                # Expand dimension of label to [1, height, width, 1] for resize operation.
                label = tf.expand_dims(label, 3)
                resized_label = tf.image.resize_images(
                    label,
                    label_size,
                    method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
                    align_corners=True)
                return tf.cast(tf.squeeze(resized_label, 3), tf.int32)
            semantic_predictions = _resize_label(semantic_predictions, image_size)
            semantic_predictions = tf.identity(
                semantic_predictions, name=_OUTPUT_NAME)
    
            semantic_probabilities = tf.image.resize_bilinear(
                semantic_probabilities, image_size, align_corners=True,
                name=_OUTPUT_PROB_NAME)
    
            if FLAGS.quantize_delay_step >= 0:
                contrib_quantize.create_eval_graph()
    
            saver = tf.train.Saver(tf.all_variables())
    
            dirname = os.path.dirname(FLAGS.export_path)
            tf.gfile.MakeDirs(dirname)
            graph_def = tf.get_default_graph().as_graph_def(add_shapes=True)
            freeze_graph.freeze_graph_with_def_protos(
                graph_def,
                saver.as_saver_def(),
                FLAGS.checkpoint_path,
                _OUTPUT_NAME + ',' + _OUTPUT_PROB_NAME,
                restore_op_name=None,
                filename_tensor_name=None,
                output_graph=FLAGS.export_path,
                clear_devices=True,
                initializer_nodes=None)
    
            if FLAGS.save_inference_graph:
                tf.train.write_graph(graph_def, dirname, 'inference_graph.pbtxt')
    
    
    if __name__ == '__main__':
        flags.mark_flag_as_required('checkpoint_path')
        flags.mark_flag_as_required('export_path')
        tf.app.run()
    
    

    I’ve set the input_image shape to [1, 3, 512, 512] because the segmentation sample seems to expect a 512x512 input.
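
    As a quick sanity check (a minimal sketch, assuming the same TensorFlow 1.x environment used for the export), the frozen graph can be reloaded to confirm that ImageTensor really ended up with the static 1x3x512x512 shape and that SemanticPredictions exists:

    import tensorflow as tf

    # Load the frozen graph produced by the export script above.
    with tf.gfile.GFile('/content/drive/MyDrive/tf1_retrain/deeplab_checkpoints/frozen_pascal_mbv2.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        # The input should report (1, 3, 512, 512); the output spatial dims may
        # show as unknown because they are computed at run time.
        print(graph.get_tensor_by_name('ImageTensor:0').shape)
        print(graph.get_tensor_by_name('SemanticPredictions:0').shape)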

  3. Move the exported frozen graph to the Jetson TX2 and convert it to ONNX

    python3 -m tf2onnx.convert --graphdef frozen_pascal_mbv2.pb --output pascal_mbv2.onnx --inputs ImageTensor:0 --outputs SemanticPredictions:0 --opset 11 --fold_const
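
    To double-check the conversion (a small sketch, assuming the onnx Python package is installed), the ONNX graph inputs and outputs can be listed and compared against the dimensions DeepStream expects:

    import onnx

    model = onnx.load('pascal_mbv2.onnx')
    for tensor in list(model.graph.input) + list(model.graph.output):
        # dim_value is 0 for dynamic dims, so fall back to the symbolic name.
        dims = [d.dim_value or d.dim_param for d in tensor.type.tensor_type.shape.dim]
        print(tensor.name, dims)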
    

    My current versions of the tools are the following:

  4. Create Engine

    /usr/src/tensorrt/bin/trtexec --onnx=pascal_mbv2.onnx --explicitBatch --saveEngine=pascal_mbv2.engine --workspace=5000
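
    Before wiring this engine into DeepStream, it can help to dump its bindings (an untested sketch, assuming the TensorRT 7.x Python bindings that ship with JetPack 4.4). The input shape should agree with infer-dims in the config used below, and the output binding name is what output-blob-names should refer to:

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with open('pascal_mbv2.engine', 'rb') as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    for i in range(engine.num_bindings):
        kind = 'input ' if engine.binding_is_input(i) else 'output'
        print(kind, engine.get_binding_name(i),
              tuple(engine.get_binding_shape(i)),
              engine.get_binding_dtype(i))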
    
  5. Run the DeepStream app after building the sample (no modifications were made to the C file when building)

    ./deepstream-segmentation-app deeplab_segmentation_config.txt sample_720.mjpeg
    

    The config file I used

    [property]
    gpu-id=0
    net-scale-factor=1.0
    model-color-format=0
    #uff-file=../../../../samples/models/Segmentation/semantic/unetres18_v4_pruned0.65_800_data.uff
    model-engine-file=pascal_mbv2.engine
    infer-dims=3;512;512
    #uff-input-order=0
    #uff-input-blob-name=data
    batch-size=2
    ## 0=FP32, 1=INT8, 2=FP16 mode
    network-mode=0
    num-detected-classes=21
    interval=0
    gie-unique-id=1
    network-type=2
    output-blob-names=final_conv/BiasAdd
    segmentation-threshold=0.0
    #parse-bbox-func-name=NvDsInferParseCustomSSD
    #custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so
    #scaling-filter=0
    #scaling-compute-hw=0
    
    [class-attrs-all]
    roi-top-offset=0
    roi-bottom-offset=0
    detected-min-w=0
    detected-min-h=0
    detected-max-w=0
    detected-max-h=0
    
    ## Per class configuration
    #[class-attrs-2]
    #threshold=0.6
    #roi-top-offset=20
    #roi-bottom-offset=10
    #detected-min-w=40
    #detected-min-h=40
    #detected-max-w=400
    #detected-max-h=800
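
    To narrow down whether the black screen comes from the engine itself (for example, every pixel being predicted as the background class) or from the DeepStream config/post-processing, the engine can also be run standalone. This is an untested sketch, assuming the TensorRT 7.x Python bindings and pycuda are available on the Jetson; the engine name and the 1x3x512x512 input shape come from the steps above:

    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with open('pascal_mbv2.engine', 'rb') as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Allocate host/device buffers for every binding. For a meaningful check,
    # replace the random input with a real preprocessed frame.
    host_bufs, dev_bufs = [], []
    for i in range(engine.num_bindings):
        shape = tuple(engine.get_binding_shape(i))
        dtype = trt.nptype(engine.get_binding_dtype(i))
        if engine.binding_is_input(i):
            host = np.random.rand(*shape).astype(dtype)
        else:
            host = np.zeros(shape, dtype=dtype)
        dev = cuda.mem_alloc(host.nbytes)
        cuda.memcpy_htod(dev, host)
        host_bufs.append(host)
        dev_bufs.append(dev)

    context.execute_v2(bindings=[int(d) for d in dev_bufs])

    # If the output only ever contains one class value, the blank screen is
    # coming from the engine, not from the DeepStream side.
    for i in range(engine.num_bindings):
        if not engine.binding_is_input(i):
            cuda.memcpy_dtoh(host_bufs[i], dev_bufs[i])
            out = host_bufs[i]
            print(engine.get_binding_name(i), out.shape, 'unique values:', np.unique(out)[:20])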
    
    

The generated frozen graph / ONNX file / checkpoint I used when exporting:
Files.zip (15.1 MB)

I also went through the same procedure using PyTorch. The output screen wasn’t black, but it still didn’t show much.

  1. Export model to ONNX

    import torch
    model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet50', pretrained=True)
    model.eval()
    
    path_to_onnx ="/content/drive/MyDrive/pytorch_deeplab/pytorch_deeplab.onnx"
    dummy = torch.randn(1, 3, 512, 512)
    torch.onnx.export(model=model, args=dummy, f=path_to_onnx, opset_version=11)
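
    As an optional sanity check (a sketch, assuming onnxruntime is installed, and assuming the first ONNX output corresponds to the torchvision model's 'out' score map), the exported graph can be run on the same dummy input before building an engine from it:

    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession(path_to_onnx)
    input_name = sess.get_inputs()[0].name
    scores = sess.run(None, {input_name: dummy.numpy()})[0]  # assumed shape: [1, 21, 512, 512]
    # Compare this class map with what PyTorch predicts for the same input.
    print(scores.shape, np.unique(scores.argmax(axis=1)))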
    
  2. Convert to engine file

    /usr/src/tensorrt/bin/trtexec --onnx=deeplab_pytorch.onnx --explicitBatch --saveEngine=deeplab_pytorch.engine
    

    The output for PyTorch:

Does it indicate that the model accuracy is not good?

Hi @mchi,

To be honest, I don’t know whether it’s just the model accuracy not being good, or the TensorRT engine not being generated correctly in the first place…
How do you distinguish between the two…?

The way is to run the model before deploying with TensorRT and check that it gets good accuracy.
Whoever trained the model must have verified it with the training & validation pipeline.

Hi @mchi,

Thank you, I’ll try that out.

What should I do with the TensorFlow model?

There should be a model training & validation project for your model, like GitHub - hunglc007/tensorflow-yolov4-tflite: YOLOv4, YOLOv4-tiny, YOLOv3, YOLOv3-tiny Implemented in Tensorflow 2.0, Android. Convert YOLO v4 .weights tensorflow, tensorrt and tflite.
For example, if the model is verified in such a project, which indicates its accuracy is good, then the TF model can be converted to ONNX and deployed on TensorRT for inference.

Hi @mchi,
For my tensorflow model:
I ran inference with the model before converting it to ONNX, and it shows results as expected, but it still shows only a black screen when converted into a TensorRT engine and run in DeepStream. Are there any other things I should be considering?

For my Pytorch model:
It seems that the inference speed was just too slow.

Hi @Kyosuke122
Sorry for the long delay! Could you go through DeepStream SDK FAQ - #21 by mchi to check the config and the inference input and output, to narrow down the possible reason?

Thanks!