How to Use an Image Fusion Model in DeepStream

• Hardware Platform: Jetson
• DeepStream Version: 6.3
• JetPack Version: 5.1.2
• TensorRT Version: 8.5.2.2
• Issue Type: questions

My image fusion model outputs a single RGB image, and this model is utilized as a secondary GIE model. It seems that gst-nvinfer does not support image fusion models. Which plugins do I need to modify to process the model’s output?

You need to customize the postprocessing. We have customization samples for models that are not detectors, classifiers, or segmentation models. The metadata, postprocessing, and pipeline should be customized. Please refer to deepstream_tao_apps/apps/tao_others at master · NVIDIA-AI-IOT/deepstream_tao_apps (github.com).

I have examined the examples in the deepstream_tao_apps/apps/tao_others directory, and it appears that most of them use a custom video-template plugin for video stream processing and leverage cvcore functions to perform the computations. Would it be feasible for me to incorporate my own custom cvcore methods? Or should I create a custom postprocess plugin to process the video stream? I would like to display the model's fused image output in real time on the sink.

nvvideotemplate is open source; you can customize your own library as you like.

It depends on your model. The customization may be different for different models.

Please describe the model input layers and output layers in details so that we can give proper suggestions.

I have a model with two 1x3x640x640 inputs (which I've modified in the ONNX model to combine into a single 2x3x640x640 input), and the output is 1x3x640x640. That is, it takes two images as input and produces one image as output. I am now preparing to add an NvDsPostProcessFusionOutput type in the postprocess plugin to save the model's output image. Could you tell me specifically which parts I need to modify to output the image to the sink?

What kind of sink do you want? Display sink? Message broker? File sink?..

Display sink

The proposed pipeline may look like:

source → nvstreammux → nvdspreprocess → nvinfer → nvvideoconvert → nvvideotemplate → nvmultistreamtiler → nvdsosd → nv3dsink

The nvdspreprocess library should be customized to generate the 2x3x640x640 tensor the model needs. nvvideoconvert helps you scale the video to the resolution you want to display and convert the video format to RGBA. In the nvvideotemplate library, you should customize the following functions (a rough sketch is shown after this list):

  1. get the output 1x3x640x640 tensor and convert it to RGB data
  2. scale the RGB data to the resolution you want to display
  3. replace the RGB data inside the NvBufSurface with the RGB data you generated.
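
A rough sketch of these three steps inside the nvvideotemplate custom library might look like the following. It assumes the 1x3x640x640 output is planar float RGB in the [0,1] range and that the buffer reaching nvvideotemplate is already RGBA pitch-linear; the function name and layout are placeholders, not details of your model.

#include <vector>
#include <opencv2/opencv.hpp>
#include "nvbufsurface.h"

static void overlay_fusion_output (float *tensor, NvBufSurface *surf, int batch_idx)
{
  const int H = 640, W = 640;                      /* model output height/width */

  /* 1. Planar float R, G, B planes -> interleaved 8-bit RGBA */
  cv::Mat r (H, W, CV_32FC1, tensor);
  cv::Mat g (H, W, CV_32FC1, tensor + H * W);
  cv::Mat b (H, W, CV_32FC1, tensor + 2 * H * W);
  std::vector<cv::Mat> planes = { r, g, b };
  cv::Mat rgb_f, rgb_u8, rgba;
  cv::merge (planes, rgb_f);
  rgb_f.convertTo (rgb_u8, CV_8U, 255.0);          /* scale [0,1] -> [0,255] */
  cv::cvtColor (rgb_u8, rgba, cv::COLOR_RGB2RGBA);

  /* 2. Scale to the resolution of the surface that will be displayed */
  NvBufSurfaceParams &p = surf->surfaceList[batch_idx];
  cv::resize (rgba, rgba, cv::Size (p.width, p.height));

  /* 3. Replace the RGBA data inside the NvBufSurface */
  NvBufSurfaceMap (surf, batch_idx, 0, NVBUF_MAP_READ_WRITE);
  NvBufSurfaceSyncForCpu (surf, batch_idx, 0);
  cv::Mat dst (p.height, p.width, CV_8UC4, p.mappedAddr.addr[0], p.pitch);
  rgba.copyTo (dst);
  NvBufSurfaceSyncForDevice (surf, batch_idx, 0);
  NvBufSurfaceUnMap (surf, batch_idx, 0);
}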

I have created the following pipeline, where primary-gie is the detection model and secondary-gie is the fusion model. Both models take a combined 2x3x640x640 input from two video streams. BTW, I have some questions:


(1) My detection model works properly, but my fusion model encounters errors when used as the secondary-gie.

(2) I have set output-tensor-meta=1 for the fusion model to add the output tensor to frame_user_meta_list. What method should I use to convert the tensor data into RGB data?

Please provide the configuration file of the SGIE.

As you said before, the output tensor dimension is 1x3x640x640. Please ask the person who provided the model to you how to get the R, G, and B data from such a tensor.

In our samples, there is already a sample showing how to read the output tensor buffer. You need to customize it according to your model.
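
The pattern is roughly the following (as in the deepstream-infer-tensor-meta-test sample). It assumes output-tensor-meta=1 is set on the SGIE so the tensor is attached as NVDSINFER_TENSOR_OUTPUT_META to frame_user_meta_list; the layer index and copy size depend on your model.

for (NvDsMetaList *l_user = frame_meta->frame_user_meta_list; l_user != NULL;
    l_user = l_user->next) {
  NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
  if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
    continue;
  NvDsInferTensorMeta *tmeta = (NvDsInferTensorMeta *) user_meta->user_meta_data;
  for (unsigned int i = 0; i < tmeta->num_output_layers; i++) {
    NvDsInferLayerInfo *layer = &tmeta->output_layers_info[i];
    layer->buffer = tmeta->out_buf_ptrs_host[i];
    /* Copy the device-side output into the host buffer before reading it */
    if (tmeta->out_buf_ptrs_dev[i]) {
      cudaMemcpy (tmeta->out_buf_ptrs_host[i], tmeta->out_buf_ptrs_dev[i],
          layer->inferDims.numElements * sizeof (float), cudaMemcpyDeviceToHost);
    }
  }
  /* output_layers_info[0].buffer now points to the 1x3x640x640 float output */
}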

This is the configuration file for SGIE.


In fact, the content of this 1x3x640x640 output is an image in RGB format. I use a pointer of type NvDsInferTensorMeta to obtain the model output data, but I am not quite clear whether NvDsInferTensorMeta can be directly converted into NvBufSurface.

No. You can’t.

Please consult the person who provided the model to you for the details of the output tensor data content and format.

Please refer to NVIDIA DeepStream SDK API Reference: NvBufSurface Struct Reference | NVIDIA Docs for the NvBufSurface structure. There is also a sample showing how to get the raw data inside NvBufSurface: DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

I referred to the code in Deepstream sample code snippet - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums and converted the output of nvinfer into a 768x1024 cv::Mat. I tried to place this image in the lower-left corner of the NvBufSurface, but encountered some data conversion issues. Below is my code.

static GstPadProbeReturn
tiler_src_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
  GstBuffer *buf = (GstBuffer *) info->data;
  guint num_rects = 0; 
  NvDsObjectMeta *obj_meta = NULL;
  guint vehicle_count = 0;
  guint person_count = 0;
  NvDsMetaList * l_frame = NULL;
  NvDsMetaList * l_obj = NULL;
  NvDsDisplayMeta *display_meta = NULL;

  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);


  // Get original raw data
  GstMapInfo in_map_info;
  if (!gst_buffer_map (buf, &in_map_info, GST_MAP_READ)) {
      g_print ("Error: Failed to map gst buffer\n");
      return GST_PAD_PROBE_OK;
  }
  NvBufSurface *surface = (NvBufSurface *)in_map_info.data;

  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
    l_frame = l_frame->next) {
      NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);

    NvDsUserMetaList *usrMetaList = frame_meta->frame_user_meta_list;
    if (usrMetaList != NULL) {
      NvDsUserMeta *usrMetaData = (NvDsUserMeta *) usrMetaList->data;

      if(usrMetaData->base_meta.meta_type == NVDSINFER_TENSOR_OUTPUT_META){
          // NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);
          //TODO for cuda device memory we need to use cudamemcpy
          NvBufSurfaceMap (surface, -1, -1, NVBUF_MAP_READ);
          /* Cache the mapped data for CPU access */
          NvBufSurfaceSyncForCpu (surface, 0, 0); //will do nothing for unified memory type on dGPU
          guint surface_height = surface->surfaceList[frame_meta->batch_id].height;
          guint surface_width = surface->surfaceList[frame_meta->batch_id].width;

          //Create Mat from NvMM memory, refer opencv API for how to create a Mat
          cv::Mat nv12_mat = cv::Mat(surface_height*3/2, surface_width, CV_8UC1,
              surface->surfaceList[frame_meta->batch_id].mappedAddr.addr[0],
              surface->surfaceList[frame_meta->batch_id].pitch);

          NvDsInferTensorMeta *meta = (NvDsInferTensorMeta *) usrMetaData->user_meta_data;
          for (unsigned int i = 0; i < meta->num_output_layers; i++) {
            NvDsInferLayerInfo *info = &meta->output_layers_info[i];
            info->buffer = meta->out_buf_ptrs_host[i];
            if (meta->out_buf_ptrs_dev[i]) {
              cudaMemcpy (meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
                  info->inferDims.numElements * 4, cudaMemcpyDeviceToHost);
            }
          }

          //Create image from NVDSINFER_TENSOR_OUTPUT_META
          int ch = meta->output_layers_info->inferDims.d[0];
          int fusion_height = meta->output_layers_info->inferDims.d[1];
          int fusion_width = meta->output_layers_info->inferDims.d[2];
          int o_count = meta->output_layers_info->inferDims.numElements;
          int onechannel_size = fusion_height * fusion_width;
          float *outputCoverageBuffer =(float *) meta->output_layers_info[0].buffer;
          cv::Mat fusion_mat; 
          using image_type = uint8_t;
          int image_format = CV_8UC1;
          image_type* uint8Buffer = (image_type *)malloc(o_count * sizeof(image_type));
          image_type* uint8Buffer_C1 = (image_type *)malloc(onechannel_size * sizeof(image_type));
          image_type* uint8Buffer_C2 = (image_type *)malloc(onechannel_size * sizeof(image_type));
          image_type* uint8Buffer_C3 = (image_type *)malloc(onechannel_size * sizeof(image_type));
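          // Convert the float output to 8-bit, scaling by 255 and clamping to [0, 255]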
          
          for(int o_index=0; o_index < o_count; o_index++){
            uint8Buffer[o_index] = static_cast<uint8_t>(std::min(std::max(outputCoverageBuffer[o_index] * 255.0f, 0.0f), 255.0f));
          }

          for(int o_index=0; o_index < onechannel_size; o_index++){
            uint8Buffer_C1[o_index] = uint8Buffer[o_index];
            uint8Buffer_C2[o_index] = uint8Buffer[o_index + onechannel_size];
            uint8Buffer_C3[o_index] = uint8Buffer[o_index + 2 * onechannel_size];

          }

          std::vector<cv::Mat> channels;
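          // Stack the planes in reverse order (C3, C2, C1); assuming the model emits R, G, B planes, this yields OpenCV's BGR layout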
          for(int idx=2;idx>=0;idx--){
            cv::Mat dumpimg;
            if (idx == 0) dumpimg = cv::Mat(fusion_height, fusion_width, image_format, uint8Buffer_C1);
            else if (idx == 1) dumpimg = cv::Mat(fusion_height, fusion_width, image_format, uint8Buffer_C2);
            else dumpimg = cv::Mat(fusion_height, fusion_width, image_format, uint8Buffer_C3);
            channels.emplace_back(dumpimg);
          }
          cv::merge(channels, fusion_mat);
          
          NvBufSurface *inter_buf = nullptr;
          NvBufSurfaceCreateParams create_params;
          create_params.gpuId  = surface->gpuId;
          create_params.width  = surface_height*3/2;
          create_params.height = surface_width;
          create_params.size = 0;
          create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
          create_params.layout = NVBUF_LAYOUT_PITCH;
        #ifdef __aarch64__
          create_params.memType = NVBUF_MEM_DEFAULT;
        #else
          create_params.memType = NVBUF_MEM_CUDA_UNIFIED;
        #endif
          //Create another scratch RGBA NvBufSurface
          if (NvBufSurfaceCreate (&inter_buf, 1, &create_params) != 0) {
            GST_ERROR ("Error: Could not allocate internal buffer ");
            return GST_PAD_PROBE_OK;
          }
          if (NvBufSurfaceMap (inter_buf, 0, -1, NVBUF_MAP_READ_WRITE) != 0)
            std::cout << "map error" << std::endl;
          NvBufSurfaceSyncForCpu (inter_buf, 0, 0);
          cv::Mat trans_mat = cv::Mat(surface_height*3/2, surface_width, CV_8UC4,
              inter_buf->surfaceList[frame_meta->batch_id].mappedAddr.addr[0],
              inter_buf->surfaceList[0].pitch);
          nv12_mat.copyTo(trans_mat);

          cv::Mat dstROI = trans_mat(cv::Rect(0, fusion_height, fusion_mat.cols, fusion_mat.rows));
          cv::cvtColor(fusion_mat, fusion_mat, cv::COLOR_BGRA2RGBA);
          // Copy the source matrix into the ROI of the destination matrix
          fusion_mat.copyTo(dstROI);
          char file_name[128];
          sprintf(file_name, "fusion_stream%2d_%03d_2.png", frame_meta->source_id, frame_number);
          cv::imwrite(file_name, fusion_mat);


          NvBufSurfaceSyncForDevice(inter_buf, 0, 0);
          inter_buf->numFilled = 1;
          NvBufSurfTransformConfigParams transform_config_params;
          NvBufSurfTransformParams transform_params;
          NvBufSurfTransformRect src_rect;
          NvBufSurfTransformRect dst_rect;
          cudaStream_t cuda_stream;
          CHECK_CUDA_STATUS (cudaStreamCreate (&cuda_stream),
            "Could not create cuda stream");
          transform_config_params.compute_mode = NvBufSurfTransformCompute_Default;
          transform_config_params.gpu_id = surface->gpuId;
          transform_config_params.cuda_stream = cuda_stream;
          /* Set the transform session parameters for the conversions executed in this
            * thread. */
          NvBufSurfTransform_Error err = NvBufSurfTransformSetSessionParams (&transform_config_params);
          if (err != NvBufSurfTransformError_Success) {
            std::cout <<"NvBufSurfTransformSetSessionParams failed with error "<< err << std::endl;
            return GST_PAD_PROBE_OK;
          }
          /* Set the transform ROIs for source and destination, only do the color format conversion*/
          src_rect = {0, 0, surface_height*3/2, surface_width};
          dst_rect = {0, 0, surface_height*3/2, surface_width};

          /* Set the transform parameters */
          transform_params.src_rect = &src_rect;
          transform_params.dst_rect = &dst_rect;
          transform_params.transform_flag =
            NVBUFSURF_TRANSFORM_FILTER | NVBUFSURF_TRANSFORM_CROP_SRC |
              NVBUFSURF_TRANSFORM_CROP_DST;
          transform_params.transform_filter = NvBufSurfTransformInter_Default;

          /* Transformation format conversion, Transform rotated RGBA mat to NV12 memory in original input surface*/
          err = NvBufSurfTransform (inter_buf, surface, &transform_params);
          if (err != NvBufSurfTransformError_Success) {
            std::cout<< "NvBufSurfTransform failed with error %d while converting buffer" << err <<std::endl;
            return GST_PAD_PROBE_OK;
          }
          // nvds_copy_obj_meta();
          NvBufSurfaceUnMap(inter_buf, 0, 0);
        }
    }
    NvBufSurfaceUnMap(surface, 0, 0);

  }
  gst_buffer_unmap (buf, &in_map_info);
  frame_number++;
  return GST_PAD_PROBE_OK;
}

NvBufSurfaceSyncForDevice(inter_buf, 0, 0) and NvBufSurfaceSyncForCpu(inter_buf, 0, 0) always return -1 with the error nvbufsurface: Wrong buffer index (0). My model's output can be either the Y channel of YUV or an RGB image; in this code I am using the model that outputs an RGB image, and the model's output itself is correct. The NV12 data of the NvBufSurface mapped into cv::Mat contains only one channel, while the model output is saved as a three-channel RGB image. Is it feasible to do the format conversion directly with OpenCV, or should I use the model that outputs the Y channel?
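
For the Y-channel option, what I have in mind is roughly the sketch below. It assumes fusion_mat holds the fused image in RGB order, that the surface is mapped with NVBUF_MAP_READ_WRITE, and that only the luma plane of the NV12 buffer is overwritten while the chroma is left untouched.

/* Sketch: write only the luma of the fused image into the lower-left corner of
 * the NV12 surface's Y plane; the chroma (UV) plane is left as-is. */
cv::Mat fusion_gray;
cv::cvtColor (fusion_mat, fusion_gray, cv::COLOR_RGB2GRAY);   /* approximate Y */
cv::Mat y_plane (surface_height, surface_width, CV_8UC1,
    surface->surfaceList[frame_meta->batch_id].mappedAddr.addr[0],
    surface->surfaceList[frame_meta->batch_id].pitch);
fusion_gray.copyTo (y_plane (cv::Rect (0, surface_height - fusion_gray.rows,
    fusion_gray.cols, fusion_gray.rows)));
NvBufSurfaceSyncForDevice (surface, frame_meta->batch_id, 0);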

Where and how did you apply such an operation?

I have updated the code in my previous response. I implemented the operation by adding a probe to the src pad of the nvmultistreamtiler element.