How to use object detection and segmentation models in the same pipeline and display the outputs on screen

Hello everyone, I’m building an ADAS system and I’ve got a segmentation model to detect lanes and an object detection model to detect cars and people (using TLT’s trafficcamnet model). I’d like to use both of these models in the same pipeline and get the output of both models on screen.

I’m using deepstream-test2 as a sample for running multiple models, but this sample only contains object detection models, and I don’t understand how to apply its logic to a segmentation model.

I’m also using the Gst-nvsegvisual plugin with the segmentation model. My idea was to do something like deepstream-test2 and have a pgie and an sgie: the pgie with the object detection model attached in its config file, and the sgie with the segmentation model config file. With the code like this it runs well, but I can only get the mask output by nvsegvisual, and I’d like to have the original frame from the pipeline in the screen output.
I’m also accessing the mask frames with OpenCV because I want to get its output and plot it into the original frame.
I’m attaching my code and config files to help understand better what I’m currently doing.

With all this said, I’d like to know:

1 - Do I need to create a plugin to run segmentation inference and draw the output mask on the original frame?
2 - Is the Gst-nvsegvisual code accessible?
3 - I had the idea of running multiple pipelines in the same application and accessing their frames to plot the mask on the original frame. Is it a good idea to have multiple pipelines in the same application, considering that I’m running all this on a Jetson Nano?

CMakeLists.txt (2.4 KB) dsobjectdetect_model_config.txt (3.2 KB) dstest_segmentation_config_semantic.txt (3.6 KB) main.cpp (17.9 KB)

Thanks in advance!

It will take some time to investigate your code. We will get back to you when we understand your requirement.

Ok! Thank you very much @Fiona.Chen!

  1. The nvinfer plugin already supports segmentation models. The deepstream-segmentation-test sample shows how to integrate a segmentation model with nvinfer.
    The segmentation output is stored and transferred through the pipeline as NvDsInferSegmentationMeta. It is a user meta of type NVDSINFER_SEGMENTATION_META.

  2. Gst-nvsegvisual is not open source. It only visualizes the segmentation result (in NvDsInferSegmentationMeta). I think it does not meet your requirement.

  3. Multiple pipelines mean multiple threads, and it is hard to synchronize different pipelines. How would you know which segmentation result belongs to which frame in another pipeline?

You may need to implement the plotting of the mask (which is in the NvDsInferSegmentationMeta) some other way, e.g. implement a segmentation-plot plugin.

Thanks for your answers @Fiona.Chen! I have some new questions about your reply.

1 - Is there any tutorial for writing a new plugin?
2 - I’m having a hard time understanding how to get the NvDsInferSegmentationMeta. Should I get this metadata from the frame_meta inside osd_sink_pad_buffer_probe, or should I get it from a plugin like nvinfer?
3 - I’m already trying to modify the nvinfer code to get this metadata. Is it trickier to write a new plugin or to modify nvinfer to get this metadata?

Thanks in advance.

  1. All DeepStream plugins are GStreamer plugins; see “Basics of Writing a Plugin” in the GStreamer Plugin Writer’s Guide. The DeepStream SDK also provides a sample plugin in /opt/nvidia/deepstream/deepstream-5.0/sources/gst-plugins/gst-dsexample
  2. Where to get the meta depends on how you want the meta to be used. If you want to handle the meta in a plugin, it should be a new plugin, not nvinfer.
  3. The nvinfer plugin is for inference only. It is not recommended to do other things with nvinfer.

Hello @Fiona.Chen , thank you for your replies.

I’m trying to access the NvDsInferSegmentationMeta in my osd_sink_pad_buffer_probe function; my goal is to access the segmentation output in OpenCV. If you look at my code, where can I find the metadata of the mask?

With this code, I’m able to access the original pipeline frames in OpenCV. I’d like to access the segmentation output with OpenCV too, or, if that is not possible in OpenCV, some other way to access it.

static GstPadProbeReturn osd_sink_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data){

    GstBuffer *buf = (GstBuffer *) info->data;
    guint num_rects = 0;
    NvDsObjectMeta *obj_meta = NULL;
    guint vehicle_count = 0;
    guint person_count = 0;
    NvDsMetaList * l_frame = NULL;
    NvDsMetaList * l_obj = NULL;
    NvDsDisplayMeta *display_meta = NULL;

    NvDsFrameMeta *frame_meta = NULL;

    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);

    GstMapInfo in_map_info;
    NvBufSurface *surface = NULL;

    memset (&in_map_info, 0, sizeof (in_map_info));

    if (gst_buffer_map (buf, &in_map_info, GST_MAP_READWRITE)) {
        surface = (NvBufSurface *) in_map_info.data;

        NvBufSurfaceMap(surface, -1, -1, NVBUF_MAP_READ_WRITE);
        NvBufSurfaceSyncForCpu(surface, -1, -1);

        for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
            frame_meta = (NvDsFrameMeta *) (l_frame->data);
            int offset = 0;

            NvDsUserMeta *usr_meta = (NvDsUserMeta *) frame_meta->frame_user_meta_list->data;  // Here I'm trying to access the segmentation output

            gint frame_width = (gint) surface->surfaceList[frame_meta->batch_id].width;
            gint frame_height = (gint) surface->surfaceList[frame_meta->batch_id].height;

            #ifdef PLATFORM_TEGRA
                NvBufSurfTransformRect src_rect;
                NvBufSurfTransformRect dst_rect;
                NvBufSurfTransformParams transform_params;
                NvBufSurface ip_surf;

                ip_surf = *surface;

                ip_surf.numFilled = ip_surf.batchSize = 1;
                ip_surf.surfaceList = &(surface->surfaceList[frame_meta->batch_id]);

                NvOSD_RectParams rect_params;
                rect_params.left = 0;
                rect_params.top = 0;
                rect_params.width = frame_width;
                rect_params.height = frame_height;

                gint src_left = GST_ROUND_UP_2((unsigned int)rect_params.left);
                gint src_top = GST_ROUND_UP_2((unsigned int)rect_params.top);
                gint src_width = GST_ROUND_DOWN_2((unsigned int)rect_params.width);
                gint src_height = GST_ROUND_DOWN_2((unsigned int)rect_params.height);

                /* Maintain aspect ratio */
                double hdest = frame_width * src_height / (double) src_width;
                double wdest = frame_height * src_width / (double) src_height;
                guint dest_width, dest_height;

                if (hdest <= frame_height) {
                    dest_width = frame_width;
                    dest_height = hdest;
                } else {
                    dest_width = wdest;
                    dest_height = frame_height;
                }

                src_rect = {(guint)src_top, (guint)src_left, (guint)src_width, (guint)src_height};
                dst_rect = {0, 0, (guint)dest_width, (guint)dest_height};

                if (NvBufSurfaceMapEglImage(surface, frame_meta->batch_id) != 0){
                    cout << "ERROR on converting the EGL Buffer." << endl;
                }
                // Use interop APIs cuGraphicsEGLRegisterImage and
                // cuGraphicsResourceGetMappedEglFrame to access the buffer in CUDA
                static bool create_filter = true;
                static cv::Ptr< cv::cuda::Filter > filter;
                CUresult status;
                CUeglFrame eglFrame;
                CUgraphicsResource pResource = NULL;
                status = cuGraphicsEGLRegisterImage(&pResource,
                    surface->surfaceList[frame_meta->batch_id].mappedAddr.eglImage,
                    CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
                status = cuGraphicsResourceGetMappedEglFrame(&eglFrame, pResource, 0, 0);
                status = cuCtxSynchronize();
                if (create_filter) {
                    filter = cv::cuda::createSobelFilter(CV_8UC4, CV_8UC4, 1, 0, 3, 1,
                                                         cv::BORDER_DEFAULT, -1);
                    //filter = cv::cuda::createGaussianFilter(CV_8UC4, CV_8UC4, cv::Size(31,31), 0, 0, cv::BORDER_DEFAULT);
                    create_filter = false;
                }
                cv::cuda::GpuMat d_mat(frame_height, frame_width, CV_8UC4, eglFrame.frame.pPitch[0]);
                cv::cuda::GpuMat d_mat_out;
                filter->apply (d_mat, d_mat_out);
                cv::Mat cpuDstMat;
                d_mat_out.download(cpuDstMat);
                string filename = "cv_frames/frame_" + to_string(frame_number) + ".jpg";
                cv::imwrite(filename, cpuDstMat);
                status = cuCtxSynchronize();
                status = cuGraphicsUnregisterResource(pResource);

                // apply back to the original buffer
                transform_params.src_rect = &dst_rect;
                transform_params.dst_rect = &src_rect;
                NvBufSurfTransform (surface, &ip_surf, &transform_params);

                // Destroy the EGLImage
                NvBufSurfaceUnMapEglImage (surface, 0);

                cv::cuda::GpuMat frame(frame_height, frame_width, CV_8UC4,
                                       (void *) surface->surfaceList[frame_meta->batch_id].dataPtr,
                                       surface->surfaceList[frame_meta->batch_id].pitch);

                cv::Mat src_mat_BGRA;
                frame.download(src_mat_BGRA);
                cv::cvtColor(src_mat_BGRA, src_mat_BGRA, cv::COLOR_RGBA2BGR);

                filename = "cv_frames/frame_" + to_string(frame_number) + ".jpg";
                cv::imwrite(filename, src_mat_BGRA);
            #endif

            for (l_obj = frame_meta->obj_meta_list; l_obj != NULL;
                 l_obj = l_obj->next) {
                obj_meta = (NvDsObjectMeta *) (l_obj->data);
                if (obj_meta->class_id == PGIE_CLASS_ID_VEHICLE) {
                    vehicle_count++;
                    num_rects++;
                }
                if (obj_meta->class_id == PGIE_CLASS_ID_PERSON) {
                    person_count++;
                    num_rects++;
                }
            }

            display_meta = nvds_acquire_display_meta_from_pool(batch_meta);
            NvOSD_TextParams *txt_params = &display_meta->text_params[0];
            display_meta->num_labels = 1;
            txt_params->display_text = static_cast<char *>(g_malloc0(MAX_DISPLAY_LEN));
            offset = snprintf(txt_params->display_text, MAX_DISPLAY_LEN, "Person = %d ", person_count);
            offset += snprintf(txt_params->display_text + offset, MAX_DISPLAY_LEN - offset, "Vehicle = %d ", vehicle_count);

            /* Now set the offsets where the string should appear */
            txt_params->x_offset = 10;
            txt_params->y_offset = 12;

            /* Font, font-color and font-size */
            txt_params->font_params.font_name = "Serif";
            txt_params->font_params.font_size = 10;
            txt_params->font_params.font_color.red = 1.0;
            txt_params->font_params.font_color.green = 1.0;
            txt_params->font_params.font_color.blue = 1.0;
            txt_params->font_params.font_color.alpha = 1.0;

            /* Text background color */
            txt_params->set_bg_clr = 1;
            txt_params->text_bg_clr.red = 0.0;
            txt_params->text_bg_clr.green = 0.0;
            txt_params->text_bg_clr.blue = 0.0;
            txt_params->text_bg_clr.alpha = 1.0;

            nvds_add_display_meta_to_frame(frame_meta, display_meta);
        }

        NvBufSurfaceUnMap(surface, -1, -1);
    }

    gst_buffer_unmap (buf, &in_map_info);

    g_print ("Frame Number = %d Number of objects = %d "
             "Vehicle Count = %d Person Count = %d\n",
             frame_number, num_rects, vehicle_count, person_count);
    return GST_PAD_PROBE_OK;
}

Thanks in advance!

If you have read the source code of the nvinfer plugin, you may know that the NvDsInferSegmentationMeta is attached to frame_user_meta_list in NvDsFrameMeta as a kind of NvDsUserMeta; the meta type is NVDSINFER_SEGMENTATION_META. So you can get it from the NvDsUserMeta in the frame meta.

Hello @Fiona.Chen, thank you so much for the support. I was able to get the segmentation meta from the frame_meta and read the model output in OpenCV. I’m leaving the code here for future reference.

usr_meta = (NvDsUserMeta *) frame_meta->frame_user_meta_list->data;
if (usr_meta->base_meta.meta_type == NVDSINFER_SEGMENTATION_META) {
    usr_meta_data = (NvDsInferSegmentationMeta *) usr_meta->user_meta_data;
    int pixel_val = 0;
    int pixl_idx = 0;

    cv::Mat mask_(usr_meta_data->height, usr_meta_data->width, CV_8UC1, cv::Scalar(0, 0, 0));

    for (int ph = 0; ph < usr_meta_data->height; ph++) {
        for (int pw = 0; pw < usr_meta_data->width; pw++) {
            pixl_idx = ph * (int) usr_meta_data->width + pw;
            pixel_val = usr_meta_data->class_map[pixl_idx]; // this value is set to -1 or 0
            // assuming -1 is black and 0 is white
            if (pixel_val == 0) {
                mask_.at<uchar>(ph, pw) = 255;
            }
        }
    }

    string maskname = "test/test_mask_" + to_string(frame_number) + ".jpg";
    cv::resize(mask_, mask_, cv::Size(frame_meta->source_frame_width, frame_meta->source_frame_height));
    cv::imwrite(maskname, mask_);
}


I’ve only got one more question. I’m doing all this in the osd_sink_pad_buffer_probe method, and my final purpose is to plot that mask on the original frame. I’m using OpenCV to change the original frame with this mask; is it advisable to change the frame in this method, or is it better to do it all in a new plugin?
Thanks in advance.

Theoretically, both methods work for your case.

With this code, everything is being done in the osd_sink_pad_buffer_probe method. I’m getting satisfactory performance on Jetson Nano, but I’d like to improve it. With both models in the same code I get 11 FPS, and with only one model (segmentation or object detection) I get around 19 FPS.
I’m also not able to draw on the original frame in the pipeline. This is happening because I’m accessing an OpenCV GpuMat, and there are no methods to draw lines on a cv::cuda::GpuMat. With that said, I’ve got two questions:

1 - What can I do to improve performance on Jetson Nano?
2 - Is there a way to change the original frame where I can use the OpenCV CPU methods to draw lines, rects, and other CPU-based operations?

I’m taking a look at gst-dsexample like you said before; it looks like that code alters the original frame with OpenCV on the CPU.

Thanks in advance.

  1. You need to find out which part impacts performance the most and try to improve it. There is no general method for such improvement.
  2. If you change the original frame, the lines, rects and other drawings will also be the input of the inference model; is this what you want? If so, taking deepstream-test2 as an example, you can access the YUV data after the hardware decoder nvv4l2decoder and draw on the frames. The link Deepstream sample code snippet - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums shows how to access NvBufSurface.

Sorry for the late reply but this solved my problem. I was able to draw on screen and improve FPS with the code in the link you mentioned: Deepstream sample code snippet

Thank you for your support!