Cascaded Instance Segmentation for defect analysis

Please provide complete information as applicable to your setup.

• Hardware Platform: NVIDIA GeForce RTX 4070 Laptop GPU
• DeepStream Version: 6.3, using a container based on nvcr.io/nvidia/deepstream:6.3-triton-multiarch
• TensorRT Version: v8503
• NVIDIA GPU Driver Version: 550.107.02, CUDA 12.1.105
• Issue Type: Question

I have an overall question about the DeepStream SDK pipeline architecture.
My goal is to chain two YOLOv8 instance segmentation models.
The first segments objects from the background.
The second must detect defects on the previously cut-out objects of a single class.

Running YOLOv8 instance segmentation with a tracker already works; now I am struggling to understand which steps are required to prepare the data for my second model.
The input data for the second model must be the cut-out object on a black background, with dimensions 640x640.
The second model only runs on a single class from the primary inference.

My question is: how do I create this input data for the secondary model?

I was thinking about a structure like this:
uri_src → nvinfer (pgie: object segmentation, 1x3x1280x1280) → nvtracker → nvdspreprocess (is a custom parsing lib needed?) → nvinfer (sgie: defect detection, 1x3x640x640) → nvvideoconvert → nvosd → nveglglessink

This structure is already running, but I hit the following issues:

  • When I added the secondary inference, both input shapes became 1x3x640x640.
  • The visual output stays the same: the masks and object IDs are correctly displayed, but no defects or results from the secondary inference are shown. How can I check whether the second model is even applied correctly?
  • Is the default nvdspreprocess library enough, or is a custom lib needed?

DeepStream does not support such a cut-out out of the box.

The customized object cut-out has to be implemented by yourself. You need a customized nvdspreprocess library; the sample preprocess library is not sufficient.
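To give a rough idea of what such a library looks like from the plugin's point of view: gst-nvdspreprocess loads a shared object and resolves a handful of entry points by name. The sketch below is self-contained for illustration only; the struct definitions are simplified stand-ins, and the real signatures declared in nvdspreprocess_interface.h carry more parameters (CustomInitParams, tensor params, a buffer acquirer, ...).

```cpp
#include <vector>

// Simplified stand-ins for the real types declared in
// nvdspreprocess_interface.h -- NOT the actual SDK definitions.
enum NvDsPreProcessStatus { NVDSPREPROCESS_SUCCESS = 0, NVDSPREPROCESS_CUSTOM_LIB_FAILED = 1 };

struct NvDsPreProcessUnit  { void *converted_frame_ptr = nullptr; };
struct NvDsPreProcessBatch { std::vector<NvDsPreProcessUnit> units; };
struct CustomCtx           { /* library state: CUDA stream, scratch buffers, ... */ };

// gst-nvdspreprocess resolves these symbols by name (configured via
// custom-lib-path and the *-function keys in the nvdspreprocess config).
extern "C" CustomCtx *initLib() { return new CustomCtx; }   // one-time setup
extern "C" void deInitLib(CustomCtx *ctx) { delete ctx; }   // teardown

// Called by the plugin to fill the downstream nvinfer input tensor
// for one batch of units.
extern "C" NvDsPreProcessStatus
CustomTensorPreparation(CustomCtx * /*ctx*/, NvDsPreProcessBatch *batch)
{
    for (auto &unit : batch->units) {
        (void)unit;  // real code: cut the object out into the tensor buffer here
    }
    return NVDSPREPROCESS_SUCCESS;
}
```

The sample library under sources/gst-plugins/gst-nvdspreprocess/nvdspreprocess_lib shows the real signatures and is the natural starting point for a custom implementation.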

Thank you for the clear answer.
While looking into the source code and API definitions of nvdspreprocess and the custom lib, more and more questions came up. These are probably very basic, but I am just getting back into C++/GStreamer.

  • Does the nvdspreprocess_interface.h file define which functions my custom library has to implement?
  • Does my library have to build upon the existing library, since it must contain the transformation functionality?
  • Overall, where do I start? Which functions have to be present, and how can I test/debug my library in a DeepStream context?
  • Are there any existing examples?

nvdspreprocess_interface.h contains the APIs for custom library development. gst-nvdspreprocess is a DeepStream plugin; you can use it like the other plugins through the GStreamer interfaces. See: Gst-nvdspreprocess (Alpha) — DeepStream documentation

The Gst-nvdspreprocess plugin is a customizable plugin which provides a custom library interface for preprocessing input streams. You can supply your own library with your own video-processing algorithm.

The customization interfaces are described in Gst-nvdspreprocess (Alpha) — DeepStream documentation. You need to customize them all. You may need your own CUDA kernel to cut out the segmentation objects in your customized custom_transform function; the custom_tensor_function in the sample library may be leveraged.
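Before writing the CUDA kernel, the cut-out itself can be prototyped on the CPU. A minimal single-channel sketch (plain C++; buffer layouts are assumptions, and a real implementation would handle NvBufSurface pitch/format and rescale the mask from model resolution to frame resolution): for each pixel, copy the source value where the mask fires, otherwise write black.

```cpp
#include <cstdint>
#include <vector>

// CPU reference for the object cut-out: copies src pixels where the
// (binary, same-resolution) mask exceeds the threshold, writes black
// elsewhere. A real implementation would run this per object as a CUDA
// kernel on the device buffers.
std::vector<uint8_t> cutout(const std::vector<uint8_t> &src,
                            const std::vector<float> &mask,
                            int width, int height, float threshold = 0.5f)
{
    std::vector<uint8_t> dst(width * height, 0);  // black background
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int idx = y * width + x;
            if (mask[idx] > threshold)
                dst[idx] = src[idx];
        }
    }
    return dst;
}
```

Once the per-pixel logic is settled, porting it to a CUDA kernel is mostly a matter of mapping the two loops onto the thread grid.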

Thanks so far for the help.
After some time I am really getting the hang of the library and of accessing the metadata.

Now I have a few questions on how to actually configure my nvdspreprocess plugin.
I observed some, for me, strange behavior when changing the following settings:

When setting process-on-frame=1 (as in preprocessing for primary inference), the function CustomTensorPreparation is called only once per frame.
If set to 0, the function is called for every single object. Does this depend on the given batch_size?

Now my question: is the data provided to CustomTensorPreparation any different depending on this setting?
If not, I guess a single call would be enough to access the metadata, cut out all objects, and combine them in a new batch buffer. So far I think this is the case, since I am able to access all objects at once using unit[i]->frame_meta->obj_meta_list.

I would like some concrete guidance on how this plugin should be configured for this use case.

Attached you can see the two different config files for nvdspreprocess.
config_preprocess_secondary_sgie.txt (2.1 KB)
config_preprocess_secondary_pgie.txt (2.3 KB)

I added a simple print to see how often, and on which frame number, the function is called; attached you can see the output for each config.
output_sgie_config.txt (12.2 KB)
output_pgie_config.txt (19.8 KB)

“process-on-frame=1” is for PGIE mode.

“process-on-frame=0” is for SGIE mode; the function is called for every single object.

Why did you configure two nvdspreprocess instances? Only your SGIE needs the customization.
The code is completely open source.
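To make the SGIE mode concrete, a sketch of what the nvdspreprocess config could look like for this use case. Key names loosely follow the sample preprocess configs shipped with DeepStream 6.3 and should be verified against your install; the library path, function name, tensor name, and the ID values are placeholders.

```ini
# Sketch of an SGIE-mode nvdspreprocess config (verify keys against the
# config_preprocess samples in your DeepStream 6.3 install).
[property]
enable=1
process-on-frame=0
unique-id=5
# unique-id of the nvinfer (SGIE) that should consume this tensor
target-unique-ids=2
network-input-order=0
network-input-shape=4;3;640;640
processing-width=640
processing-height=640
network-color-format=0
tensor-data-type=0
tensor-name=input_1
custom-lib-path=/path/to/libcustom_object_cutout.so
custom-tensor-preparation-function=CustomTensorPreparation

[group-0]
src-ids=0
# only feed objects from the PGIE (gie-unique-id 1), single target class
operate-on-gie-id=1
operate-on-class-ids=0
```

With process-on-frame=0, network-input-shape's leading value is the maximum number of objects batched into one tensor, which matches the observation above that the number of calls depends on the batch size.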

I am not using two nvdspreprocess elements in the pipeline.
I know about the SGIE and PGIE configuration, but I was testing different settings since I found it strange that the function is called multiple times per frame.
While testing I found out that this number of calls is related to the batch size.
But how is the data inside the batch structured?
Inside which member are the actual frame and frame information contained? There are several ways to access image data, but I can't figure out which one is correct, or what the difference is:

Here is my current function. It uses the provided batch and a buffer created via acquirer->acquire(), as in the custom lib provided by NVIDIA:

NvDsPreProcessStatus
CustomObjectCutoutImpl::cutout_objects(NvDsPreProcessBatch *batch, void *&devBuf)
{
    unsigned int batch_size = batch->units.size();
    unsigned int pitch = batch->pitch;

    /* For each unit in the input batch, cut out the corresponding object
     * into the input binding buffer. */
    for (unsigned int i = 0; i < batch_size; i++)
    {
        NvDsPreProcessUnit *unit = &batch->units[i];
        unsigned char *inPtr = (unsigned char *)unit->converted_frame_ptr;
        unsigned int inWidth = unit->frame_meta->source_frame_width;
        unsigned int inHeight = unit->frame_meta->source_frame_height;
        float *mask_buffer = unit->obj_meta->mask_params.data;
        NvDsObjectMeta *obj_meta = unit->obj_meta;
        float *outPtr = (float *)devBuf +
            i * m_NetworkSize.channels * m_NetworkSize.width * m_NetworkSize.height;
        NvDsPreProcessCutoutObject(outPtr, inPtr, mask_buffer, obj_meta,
            inWidth, inHeight, pitch, 1.0f, *m_PreProcessStream);
    }

    return NVDSPREPROCESS_SUCCESS;
}

Is this the correct data for cutting the object out of the image?
Which image does the metadata inside the unit refer to: converted_frame_ptr? frame_meta? source_surface?
This is very unclear, and the API docs are not helpful either.
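From reading nvdspreprocess_interface.h, my understanding of the field roles is summarized below. The struct is a commented stand-in for illustration only, not the SDK definition (the real fields are typed pointers, not void*), so double-check against the header:

```cpp
// Commented stand-in for NvDsPreProcessUnit (illustration only -- see
// nvdspreprocess_interface.h for the real definition and exact types).
struct NvDsPreProcessUnitSketch {
    void *converted_frame_ptr = nullptr;  // this unit's pixels AFTER the plugin's
                                          // own scale/convert step, i.e. already at
                                          // processing-width x processing-height
    void *input_surf_params   = nullptr;  // surface parameters of the original
                                          // buffer flowing through the pipeline
    void *frame_meta          = nullptr;  // metadata of the full frame (NvDsFrameMeta)
    void *obj_meta            = nullptr;  // this unit's object (NvDsObjectMeta);
                                          // populated in process-on-frame=0 mode
};
```

If this reading is right, converted_frame_ptr already refers to the scaled per-unit buffer rather than the original frame.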

The code is open source. You may customize it according to your own requirements.

Sorry, but this does not answer my question. My question is: how do I access the segmentation mask data inside the CustomTensorPreparation function?

I am trying to access the mask_params for each unit given by nvdspreprocess.
For each unit I tried multiple ways to access this data, but it is always uninitialized.

DEBUG: CustomTensorPreparation
DEBUG: Acquired buffer from tensor pool
DEBUG: Cutout objects start
DEBUG: Cutout objects batch size: 1
DEBUG: unit 0
DEBUG: obj_l_element_ptr 0x7f9fbc1fef00
DEBUG: obj_l_element_ptr->mask_params (nil)
DEBUG: obj_l_element_ptr 0x7f9fbc1feec0
DEBUG: obj_l_element_ptr->mask_params (nil)
DEBUG: obj_l_element_ptr 0x7f9fbc1feea0
DEBUG: obj_l_element_ptr->mask_params (nil)
DEBUG: obj_l_element_ptr 0x7f9fbc1fee80
DEBUG: obj_l_element_ptr->mask_params (nil)
DEBUG: obj_l_element_ptr 0x7f9fbc1fee60
DEBUG: obj_l_element_ptr->mask_params (nil)

This is the function where I try to access the metadata:

NvDsPreProcessStatus
CustomObjectCutoutImpl::cutout_objects(NvDsPreProcessBatch *batch, void *&devBuf)
{
    printf("DEBUG: Cutout objects start\n");
    unsigned int batch_size = batch->units.size();
    printf("DEBUG: Cutout objects batch size: %u\n", batch_size);

    /* For each unit in the batch, walk the object list of the associated
     * frame and inspect the mask data attached to each object. */
    for (unsigned int i = 0; i < batch_size; i++)
    {
        printf("DEBUG: unit %u\n", i);
        NvDsObjectMetaList *l_obj = batch->units[i].roi_meta.frame_meta->obj_meta_list;
        for (NvDsObjectMetaList *obj_l_element_ptr = l_obj; obj_l_element_ptr != nullptr;
             obj_l_element_ptr = obj_l_element_ptr->next)
        {
            NvDsObjectMeta *object_meta = (NvDsObjectMeta *)obj_l_element_ptr->data;
            printf("DEBUG: obj_l_element_ptr %p\n", (void *)obj_l_element_ptr);
            /* mask_params is a struct; the mask buffer is its data member */
            printf("DEBUG: obj_l_element_ptr->mask_params %p\n",
                   (void *)object_meta->mask_params.data);
        }
    }
    return NVDSPREPROCESS_SUCCESS;
}
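Independent of where the nil pointers come from, the cut-out code should guard against objects that carry no mask, since mask_params.data stays NULL unless the upstream nvinfer and its output parser actually attach instance masks (e.g. via output-instance-mask=1 in an instance-segmentation PGIE setup). A self-contained sketch of that guard, with the mask type simplified to a stand-in:

```cpp
// Simplified stand-in for NvOSD_MaskParams: in the real struct, `data`
// stays NULL unless the upstream nvinfer/parser attached a mask.
struct MaskParamsSketch {
    float *data = nullptr;   // mask buffer, NULL if no mask was attached
    unsigned int size = 0;   // buffer size in bytes
};

// Returns true only when the object carries a usable mask; skipping
// mask-less objects avoids dereferencing a NULL mask buffer.
bool has_usable_mask(const MaskParamsSketch &mp)
{
    return mp.data != nullptr && mp.size > 0;
}
```

If every object fails this check, the masks were never attached upstream, and the PGIE configuration and its mask-filling output parser are the place to look before debugging the preprocess library further.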

For the Service Maker issue, please create a new topic.

Is there any way to access the original frame, as provided by the source element, inside nvdspreprocess?
I need this because we run the primary inference at a smaller scale to generate the masks, and then use the masks to cut out high-resolution objects from the original frame for the secondary inference.

The original frame is available inside nvdspreprocess library. Please refer to the source code. /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvdspreprocess/nvdspreprocess_lib

I mean the original frame from the beginning of the pipeline, i.e. the one provided by the main source element producing the input frames for the whole pipeline: the unscaled, unpadded data.

If you do not change the video format and resolution with nvvideoconvert or nvstreammux, the original frame will be available inside nvdspreprocess. nvinfer and nvtracker are both “in-place” transform plugins; they never change the video.

So when connecting source → nvstreammux → nvinfer → tracker → nvdspreprocess, with no configuration for nvstreammux, the original frame should still be present?
Sorry for asking again and again, but yesterday I got quite confused trying to access the image.

If you are using nvstreammux, you need to configure its “width” and “height” to your original video’s resolution. If you are using the new nvstreammux, nothing needs to be done.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.