Mask Obtained from DeepStream Is Not the Same as TAO Inferencing Output

Hey folks,

Our Python wrapper code is based on NVIDIA's reference Python apps. We have a trained UNet .etlt model converted to an INT8 engine file, and we need the NumPy array returned by pyds to be an indexed mask. Instead, the array returned from the DeepStream pipeline contains what look like discrete color-map values or probability scores, not class indices that our downstream overlay code can interpret as masks. TAO inference on the same model gives a very good index mask, but DeepStream does not give an index mask as output.

We would like DeepStream to produce the same kind of mask that we get from the TAO inference results, for further use. Any help or suggestions would be appreciated.
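
To make the distinction concrete: an index mask stores one class ID per pixel, while a score map stores one value per class per pixel. Below is a minimal NumPy sketch of collapsing per-class scores into an index mask with argmax. It assumes the scores are available as a channels-last (H, W, C) array. The function name `scores_to_index_mask` and the demo values are hypothetical, and this is not necessarily how TAO or nvinfer build their masks internally.

```python
import numpy as np


def scores_to_index_mask(scores, class_axis=-1):
    """Collapse per-class scores into a uint8 index mask via argmax.

    Assumption: `scores` is a 3-D array with one channel per class.
    """
    scores = np.asarray(scores)
    if scores.ndim != 3:
        raise ValueError("expected a 3-D array of per-class scores")
    # argmax picks, for every pixel, the class with the highest score.
    return np.argmax(scores, axis=class_axis).astype(np.uint8)


# Hypothetical 2x2 image with 3 classes, laid out as (H, W, C).
demo_scores = np.array([[[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]],
                        [[0.2, 0.2, 0.6], [0.7, 0.2, 0.1]]])
demo_mask = scores_to_index_mask(demo_scores)
print(demo_mask)
```

If the scores were channels-first (C, H, W) instead, the same helper could be called with `class_axis=0`.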

I am enclosing my Python code and config file, along with the desired mask type and the obtained mask type.

#!/usr/bin/env python3

import sys

sys.path.append('../')
import gi
import math

gi.require_version('Gst', '1.0')
from gi.repository import GLib, Gst
from common.is_aarch_64 import is_aarch64
from common.bus_call import bus_call

import cv2
import pyds
import numpy as np
import os.path
from os import path

MAX_DISPLAY_LEN = 64
MUXER_OUTPUT_WIDTH = 1920
MUXER_OUTPUT_HEIGHT = 1080
MUXER_BATCH_TIMEOUT_USEC = 4000000
TILED_OUTPUT_WIDTH = 1280
TILED_OUTPUT_HEIGHT = 720
COLORS = [[128, 128, 64], [0, 0, 128], [0, 128, 128], [128, 0, 0],
          [128, 0, 128], [128, 128, 0], [0, 128, 0], [0, 0, 64],
          [0, 0, 192], [0, 128, 64], [0, 128, 192], [128, 0, 64],
          [128, 0, 192], [128, 128, 128]]


#COLORS = [[0, 0, 0],[1, 1, 1], [255, 255, 255]]
          #,[128, 0, 128], [128, 128, 0], [0, 128, 0], [0, 0, 64],
          #[0, 0, 192], [0, 128, 64], [0, 128, 192], [128, 0, 64],
          #[128, 0, 192], [255, 255, 255]]


def map_mask_as_display_bgr(mask):
    """ Assigning multiple colors as image output using the information
        contained in mask. (BGR is opencv standard.)
    """
    # getting a list of available classes
    m_list = list(set(mask.flatten()))
    print('m_list',m_list)

    shp = mask.shape
    print(np.unique(mask))
    bgr = np.zeros((shp[0], shp[1], 3))#,dtype=np.int32)
    print(np.unique(bgr))
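    for idx in m_list:
        # Skip values that are not valid class indices; a negative value would
        # silently wrap around in COLORS and a large one raises IndexError.
        if idx < 0 or idx >= len(COLORS):
            continue
        print((idx),COLORS[idx])
        bgr[mask == idx] = COLORS[idx]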
        #bgr[mask == idx] = idx
    print(np.unique(bgr))
    #print(bgr)
    return bgr


def seg_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return Gst.PadProbeReturn.OK

    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta
            # The casting is done by pyds.NvDsFrameMeta.cast()
            # The casting also keeps ownership of the underlying memory
            # in the C code, so the Python garbage collector will leave
            # it alone.
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        frame_number = frame_meta.frame_num
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            try:
                # Note that l_user.data needs a cast to pyds.NvDsUserMeta
                # The casting is done by pyds.NvDsUserMeta.cast()
                # The casting also keeps ownership of the underlying memory
                # in the C code, so the Python garbage collector will leave
                # it alone.
                seg_user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break
            if seg_user_meta and seg_user_meta.base_meta.meta_type == \
                    pyds.NVDSINFER_SEGMENTATION_META:
                try:
                    # Note that seg_user_meta.user_meta_data needs a cast to
                    # pyds.NvDsInferSegmentationMeta
                    # The casting is done by pyds.NvDsInferSegmentationMeta.cast()
                    # The casting also keeps ownership of the underlying memory
                    # in the C code, so the Python garbage collector will leave
                    # it alone.
                    segmeta = pyds.NvDsInferSegmentationMeta.cast(seg_user_meta.user_meta_data)
                    print(segmeta)
                except StopIteration:
                    break
                # Retrieve mask data in the numpy format from segmeta
                # Note that pyds.get_segmentation_masks() expects object of
                # type NvDsInferSegmentationMeta
                masks = pyds.get_segmentation_masks(segmeta)
                print('before',np.unique(np.array(masks)))
                masks = np.array(masks, copy=True, order='C')
                print('after',np.unique(masks))
                print(masks.shape)
                print(masks)
                # map the obtained masks to colors of 2 classes.
                frame_image = map_mask_as_display_bgr(masks)
                cv2.imwrite(folder_name + "/" + str(frame_number) + ".jpg", frame_image)
                #cv2.imwrite(folder_name + "/" + str(frame_number) + ".jpg", masks)
            try:
                l_user = l_user.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK


def main(args):
    # Check input arguments
    if len(args) != 4:
        sys.stderr.write("usage: %s config_file <jpeg/mjpeg file> "
                         "<path to save seg images>\n" % args[0])
        sys.exit(1)

    global folder_name
    folder_name = args[-1]
    if path.exists(folder_name):
        sys.stderr.write("The output folder %s already exists. "
                         "Please remove it first.\n" % folder_name)
        sys.exit(1)
    os.mkdir(folder_name)

    config_file = args[1]
    num_sources = len(args) - 3
    # Standard GStreamer initialization
    Gst.init(None)

    # Create gstreamer elements
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")

    # Source element for reading from the file
    print("Creating Source \n ")
    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write(" Unable to create Source \n")

    # Since the data format in the input file is jpeg,
    # we need a jpegparser
    print("Creating jpegParser \n")
    jpegparser = Gst.ElementFactory.make("jpegparse", "jpeg-parser")
    if not jpegparser:
        sys.stderr.write("Unable to create jpegparser \n")

    # Use nvdec for hardware accelerated decode on GPU
    print("Creating Decoder \n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write(" Unable to create Nvv4l2 Decoder \n")

    # Create nvstreammux instance to form batches from one or more sources.
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")

    # Create segmentation for primary inference
    seg = Gst.ElementFactory.make("nvinferbin", "primary-nvinference-engine")
    if not seg:
        sys.stderr.write("Unable to create primary inference\n")

    # Create nvsegvisual for visualizing segmentation
    nvsegvisual = Gst.ElementFactory.make("nvsegvisual", "nvsegvisual")
    if not nvsegvisual:
        sys.stderr.write("Unable to create nvsegvisual\n")

    if is_aarch64():
        transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")

    print("Creating EGLSink \n")
    #sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
    sink = Gst.ElementFactory.make("filesink", "nvvideo-renderer")
    if not sink:
        sys.stderr.write(" Unable to create egl sink \n")

    print("Playing file %s " % args[2])
    source.set_property('location', args[2])
    if is_aarch64() and (args[2].endswith("mjpeg") or args[2].endswith("mjpg")):
        decoder.set_property('mjpeg', 1)
    streammux.set_property('width', 1920)
    streammux.set_property('height', 1080)
    streammux.set_property('batch-size', 1)
    streammux.set_property('batched-push-timeout', 4000000)
    seg.set_property('config-file-path', config_file)
    pgie_batch_size = seg.get_property("batch-size")
    if pgie_batch_size != num_sources:
        print("WARNING: Overriding infer-config batch-size", pgie_batch_size,
              " with number of sources ", num_sources,
              " \n")
        seg.set_property("batch-size", num_sources)
    nvsegvisual.set_property('batch-size', num_sources)
    nvsegvisual.set_property('width', 512)
    nvsegvisual.set_property('height', 512)
    #sink.set_property("qos", 0)
    sink.set_property("location", 'sample_out.mkv')
    print("Adding elements to Pipeline \n")
    pipeline.add(source)
    pipeline.add(jpegparser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(seg)
    pipeline.add(nvsegvisual)
    pipeline.add(sink)
    
    if is_aarch64():
        pipeline.add(transform)

    # we link the elements together
    # file-source -> jpeg-parser -> nvv4l2-decoder ->
    # nvinfer -> nvsegvisual -> sink
    print("Linking elements in the Pipeline \n")
    source.link(jpegparser)
    jpegparser.link(decoder)

    sinkpad = streammux.get_request_pad("sink_0")
    if not sinkpad:
        sys.stderr.write(" Unable to get the sink pad of streammux \n")
    srcpad = decoder.get_static_pad("src")
    if not srcpad:
        sys.stderr.write(" Unable to get source pad of decoder \n")
    srcpad.link(sinkpad)
    streammux.link(seg)
    seg.link(nvsegvisual)
    if is_aarch64():
        nvsegvisual.link(transform)
        transform.link(sink)
    else:
        nvsegvisual.link(sink)
    # create an event loop and feed gstreamer bus messages to it
    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message", bus_call, loop)

    # Let's add a probe to get informed of the metadata generated; we add the probe to
    # the src pad of the inference element
    seg_src_pad = seg.get_static_pad("src")
    if not seg_src_pad:
        sys.stderr.write(" Unable to get src pad \n")
    else:
        seg_src_pad.add_probe(Gst.PadProbeType.BUFFER, seg_src_pad_buffer_probe, 0)

    # List the sources
    print("Now playing...")
    for i, source in enumerate(args[1:-1]):
        if i != 0:
            print(i, ": ", source)

    print("Starting pipeline \n")
    # start playback and listen to events
    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    pipeline.set_state(Gst.State.NULL)


if __name__ == '__main__':
    sys.exit(main(sys.argv))
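
As a side note on the coloring step above, the per-class loop can also be written as a single palette lookup that ignores any value outside the valid class range. This is only a sketch: the 3-entry `PALETTE` and the name `colorize_index_mask` are hypothetical, and it assumes the input is already a 2-D index mask.

```python
import numpy as np

# Hypothetical BGR palette for a 3-class model (index = class ID).
PALETTE = np.array([[0, 0, 0], [0, 0, 128], [0, 128, 0]], dtype=np.uint8)


def colorize_index_mask(mask, palette=PALETTE):
    """Map a 2-D index mask to a BGR image; out-of-range values stay black."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        raise ValueError("expected a 2-D index mask")
    # Only pixels holding a valid class ID are colored.
    valid = (mask >= 0) & (mask < len(palette))
    bgr = np.zeros(mask.shape + (3,), dtype=np.uint8)
    bgr[valid] = palette[mask[valid].astype(np.intp)]
    return bgr


# -1 and 200 are not class IDs here, so those pixels are left black.
demo = colorize_index_mask(np.array([[0, 1], [2, -1], [200, 1]]))
print(demo.shape, demo.dtype)
```

Returning a uint8 array keeps the result directly usable with `cv2.imwrite`.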

Config file

[property]
gpu-id=0
net-scale-factor=0.007843
# Since the model input channel is 3, using RGB color format.
model-color-format=1
offsets=127.5;127.5;127.5
labelfile-path=labels.txt
##Replace following path to your model file
model-engine-file=/home/elio_admin/full/TensorRT_model/model.isbiint8.engine
int8-calib-file=/home/elio_admin/full/TensorRT_model/model_cal.bin

#model-file=/home/elio_admin/deep/full/TensorRT_model/model.etlt

#Working one
#model-engine-file=model.etlt_b1_gpu0_fp32.engine

#The current DS cannot parse ONNX etlt models, so you need to
#convert the etlt model to a TensorRT engine first using tlt-convert

infer-dims=3;512;512

batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
interval=0
gie-unique-id=1
network-type=2
output-blob-names=Sigmoid
segmentation-threshold=0.0
maintain-aspect-ratio=0
segmentation-output-order=1
output-tensor-meta=1

[class-attrs-all]
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
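
When comparing this config against a known-good one, it can help to print only the keys that look relevant to the segmentation output. The sketch below uses Python's standard `configparser`, which appears to handle this INI-style layout. The helper names and the choice of keys are my own assumptions, not an official checklist, and the sample text is a trimmed copy of the config above.

```python
import configparser

# Keys assumed (not confirmed) to matter most for the mask output.
SEGMENTATION_KEYS = ("network-type", "num-detected-classes",
                     "output-blob-names", "segmentation-threshold",
                     "segmentation-output-order", "infer-dims")


def read_nvinfer_properties(text):
    """Parse nvinfer-style config text and return the [property] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["property"])


def summarize_segmentation_settings(props):
    """Pick out the segmentation-related keys; missing keys map to None."""
    return {key: props.get(key) for key in SEGMENTATION_KEYS}


SAMPLE_CONFIG = """
[property]
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
network-type=2
output-blob-names=Sigmoid
segmentation-threshold=0.0
segmentation-output-order=1
infer-dims=3;512;512
"""

summary = summarize_segmentation_settings(read_nvinfer_properties(SAMPLE_CONFIG))
for key, value in summary.items():
    print(key, "=", value)
```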

Desired Mask Array output

np.unique(tao_mask,return_counts=True)
#plt.imshow(tao_mask*50)
>>(array([0, 1, 2], dtype=uint8), array([772107,  11730,   2595]))

Obtained Mask Array Output

np.unique(deepstream,return_counts=True)
#plt.imshow(deepstream)
>>(array([ 55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,  66,  67,
         68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,  79,  80,
         81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,  92,  93,
         94,  95,  96,  97,  98,  99, 100, 101, 102, 103, 104, 105, 106,
        107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119,
        120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132,
        133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145,
        146, 147, 148, 149, 150, 151], dtype=uint8),
 array([     1,      6,     23,     57,    140,    311,    773,    560,
          2045,   3135,    873,   1814,   1080,    655,    433,    357,
           271,    245,    254,    229,    211,    185,    161,    172,
           162,    176,    154,    138,    148,    116,    104,    112,
           112,     95,    132,     93,     96,     96,    119,     89,
           113,    100,     78,     85,     96,     79,    112,     99,
           105,    113,    132,    102,    135,    140,    140,    195,
           234,    312,    368,    444,    552,    575,    721,    769,
           910,   1050,   1260,   1422,   1671,   2565,   4267,   6763,
         14294, 694327,  13904,   8103,   4137,   2662,   1680,   1229,
           876,    767,    643,    557,    398,    340,    297,    208,
           141,     83,     64,     33,     27,     13,      7,      1,
             1]))
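
The difference between the two outputs above can be checked programmatically. The following is a small sketch, assuming the mask is available as a NumPy array and that the model has a known number of classes; `describe_mask` is a hypothetical helper name.

```python
import numpy as np


def describe_mask(mask, num_classes):
    """Report unique values and whether they all fall in [0, num_classes)."""
    values, counts = np.unique(np.asarray(mask), return_counts=True)
    if values.size == 0:
        in_range = False
    else:
        in_range = bool(values.min() >= 0 and values.max() < num_classes)
    return {"values": values.tolist(),
            "counts": counts.tolist(),
            "is_index_mask": in_range}


# Hypothetical stand-ins for the two kinds of output.
index_like = describe_mask(np.array([[0, 0, 1], [2, 0, 0]], dtype=np.uint8), 3)
score_like = describe_mask(np.array([[55, 128, 128], [151, 128, 127]], dtype=np.uint8), 3)
print(index_like["is_index_mask"], score_like["is_index_mask"])
```

With three classes, an array like the TAO output would pass this check, while one whose values range from 55 to 151 would not.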

• Hardware Platform

V100

• Deepstream Version

deepstream-app version 6.0.1
DeepStreamSDK 6.0.1
CUDA Driver Version: 11.4
CUDA Runtime Version: 11.4
TensorRT Version: 8.2
cuDNN Version: 8.3
libNVWarp360 Version: 2.0.1d3
gst-launch-1.0 version 1.20.3
GStreamer 1.20.3

• NVIDIA GPU Driver Version (valid for GPU only)

Driver Version: 470.42.01    
CUDA Version: 11.4

There has been no update from you for a while, so we assume this is no longer an issue
and are closing this topic. If you need further support, please open a new one.
Thanks

@nitinp14920914 DeepStream is not tied to any specific model, and we don't know anything about your model, so you may need to check whether the nvinfer configuration is correct. The nvinfer debug tips in the FAQ may be helpful: DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

Can you refer to the C sample first? NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream (github.com)
