How to manually form a batch

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson AGX Xavier
• DeepStream Version 5.1
• JetPack Version (valid for Jetson only) 4.5.1
• TensorRT Version 7.1.3

Hello! I am trying to build a batch from a single input buffer by cropping several regions out of it. I modified gst-dsexample, and my transform_ip looks like this:

static GstFlowReturn
gst_batcher_transform_ip (GstBaseTransform * btrans, GstBuffer * inbuf)
{
  GstBatcher *batcher = GST_BATCHER (btrans);
  GstMapInfo in_map_info;
  GstFlowReturn flow_ret = GST_FLOW_ERROR;
  gdouble scale_ratio = 1.0;
  BatcherOutput *output;

  NvBufSurface *surface = NULL;
  guint i = 0;

  batcher->frame_num++;
  CHECK_CUDA_STATUS (cudaSetDevice (batcher->gpu_id),
      "Unable to set cuda device");

  memset (&in_map_info, 0, sizeof (in_map_info));
  if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
    g_print ("Error: Failed to map gst buffer\n");
    goto error;
  }

  nvds_set_input_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
  surface = (NvBufSurface *) in_map_info.data;
  GST_DEBUG_OBJECT (batcher,
      "Processing Frame %" G_GUINT64_FORMAT " Surface %p\n",
      batcher->frame_num, surface);

  if (surface->batchSize != 1) {
    GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
        ("Batch size should be 1"), (NULL));
    goto error;  /* unmap inbuf before returning */
  }

  if (CHECK_NVDS_MEMORY_AND_GPUID (batcher, surface))
    goto error;

  /* Fill the batch transform parameters. */
  while (batcher->batch_insurf.numFilled < 4) {
    //Memset the memory
    NvBufSurfaceMemSet (batcher->inter_buf, batcher->batch_insurf.numFilled, 0, 0);

    /* Create temporary src and dest surfaces for NvBufSurfTransform API. */
    batcher->batch_insurf.surfaceList[batcher->batch_insurf.numFilled] = *surface->surfaceList;

    /* Set the source ROI. Could be entire frame or an object. */
    batcher->transform_params.src_rect[batcher->batch_insurf.numFilled] = {
      (guint) batcher->rect_params[batcher->batch_insurf.numFilled]->top,
      (guint) batcher->rect_params[batcher->batch_insurf.numFilled]->left,
      (guint) batcher->rect_params[batcher->batch_insurf.numFilled]->width,
      (guint) batcher->rect_params[batcher->batch_insurf.numFilled]->height,
    };

    /* Set the dest ROI. Could be the entire destination frame or part of it to
     * maintain aspect ratio. */
    batcher->transform_params.dst_rect[batcher->batch_insurf.numFilled] = {0, 0, 300, 300};

    batcher->batch_insurf.numFilled++;
  }

  NvBufSurfTransform_Error err;
  NvBufSurfTransformConfigParams transform_config_params;

  // Configure transform session parameters for the transformation
  transform_config_params.compute_mode = NvBufSurfTransformCompute_Default;
  transform_config_params.gpu_id = batcher->gpu_id;
  transform_config_params.cuda_stream = batcher->cuda_stream;

  err = NvBufSurfTransformSetSessionParams (&transform_config_params);
  if (err != NvBufSurfTransformError_Success) {
    GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
        ("NvBufSurfTransformSetSessionParams failed with error %d", err),
        (NULL));
    goto error;  /* unmap inbuf before returning */
  }

  /* Batched transformation. */
  err = NvBufSurfTransform (&batcher->batch_insurf, batcher->inter_buf,
      &batcher->transform_params);

  if (err != NvBufSurfTransformError_Success) {
    GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
        ("NvBufSurfTransform failed with error %d while converting buffer",
            err), (NULL));
    goto error;  /* unmap inbuf before returning */
  }

  // TODO: copy the inter_buf content to inbuf and fill the metadata here.

  batcher->batch_insurf.numFilled = 0;
  /* Fall through to the cleanup below so inbuf is unmapped on success too. */
  flow_ret = GST_FLOW_OK;
error:

  nvds_set_output_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
  gst_buffer_unmap (inbuf, &in_map_info);
  return flow_ret;
}

Now I think I have to copy the inter_buf content to inbuf and fill in the batch metadata, but I am not sure how to do it. Can you please help?
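Cleanup such as gst_buffer_unmap has to run on every exit path of a function like this one. As a minimal sketch of one way to arrange that in C++, the snippet below uses a small scope guard. `ScopeExit`, `FakeMap` and `process` are hypothetical names introduced only for illustration; `FakeMap` stands in for a mapped GstBuffer so the example runs without GStreamer.

```cpp
#include <utility>

// Hypothetical helper (not part of GStreamer or DeepStream): runs a callable
// when the enclosing scope ends, whichever return statement is taken.
template <typename F>
class ScopeExit {
public:
  explicit ScopeExit (F fn) : fn_ (std::move (fn)) {}
  ~ScopeExit () { fn_ (); }
  ScopeExit (const ScopeExit &) = delete;
  ScopeExit &operator= (const ScopeExit &) = delete;
private:
  F fn_;
};

// Stand-in for a mapped buffer; in the plugin this would be inbuf + GstMapInfo.
struct FakeMap {
  bool mapped = false;
};

// Returns false on the early-exit path and true on success.
// The "unmap" lambda runs in both cases when guard goes out of scope.
bool process (FakeMap &map, bool fail_early)
{
  map.mapped = true;                              // stands in for gst_buffer_map()
  auto unmap = [&map] { map.mapped = false; };    // stands in for gst_buffer_unmap()
  ScopeExit<decltype (unmap)> guard (unmap);

  if (fail_early)
    return false;                                 // early return, still unmapped
  return true;
}
```

In the plugin, the lambda would call gst_buffer_unmap on inbuf, which removes the need for a separate `goto error` cleanup label.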

If you want to replace the original batch meta, you need to release the previous batch meta from upstream and create a new batch meta of your own.

Creating your own batch meta is not recommended, because all DeepStream plugins rely on the batch meta to handle data. Your new batch meta may cause other DeepStream plugins to fail.

The only thing we can tell you is that all batch meta related APIs are defined in /opt/nvidia/deepstream/deepstream/sources/includes/nvdsmeta.h.

I am basically trying to make a batch from a single source, so there is no original batch meta and calling gst_buffer_get_nvds_batch_meta() will fail. I will look through the header file you mentioned.

Also, how do I copy the content of inter_buf to inbuf? And if I want to do transform() instead of transform_ip(), can I allocate the output buffer in NVMM as well?

An NVMM buffer may be a hardware buffer, which cannot be accessed by the CPU.
See the NvBufSurface APIs: NvDsBufSurface API (DeepStream 5.1 documentation).

Also, dsexample is a GstBaseTransform plugin that works in "in-place" mode, so you cannot replace the buffer unless the caps are exactly the same.

Yes, I was thinking of implementing the transform() function instead of transform_ip(). Back to the original question: my cropped images are in the inter_buf surfaces, so I need to copy them to inbuf to forward them to the next element in the pipeline, right? How do I do that?

Hello! I have read through the documentation and implemented transform() and prepare_output_buffer() as follows:

static GstFlowReturn
gst_batcher_prepare_output_buffer(GstBaseTransform * btrans, GstBuffer * inbuf, GstBuffer ** outbuf)
{
	GstBatcher *batcher = GST_BATCHER (btrans);
	GstFlowReturn flow_ret = GST_FLOW_ERROR;	
	NvBufSurfaceCreateParams create_params = { 0 };	/* zero the fields not set below */
	NvDsBatchMeta *batch_meta = NULL;
	NvDsFrameMeta *frame_meta = NULL;
	NvDsMeta *meta = NULL;

	//g_print ("Prepare start\n");	

	if (batcher->inter_buf)
		NvBufSurfaceDestroy (batcher->inter_buf);
	batcher->inter_buf = NULL;	

	/* allocate buffer for crop destination */
	create_params.gpuId  = batcher->gpu_id;
	create_params.width  = 500;
	create_params.height = 600;
	create_params.size = 0;
	create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
	create_params.layout = NVBUF_LAYOUT_PITCH;

	if(batcher->is_integrated) {
		create_params.memType = NVBUF_MEM_DEFAULT;
	} else {
		create_params.memType = NVBUF_MEM_CUDA_PINNED;
	}

	if (NvBufSurfaceCreate (&batcher->inter_buf, 4,
				&create_params) != 0) {
		GST_ERROR ("Error: Could not allocate internal buffer for batcher");
		goto error;
	}

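	/* The wrapped memory is one NvBufSurface descriptor (its surfaceList holds
	 * the 4 frames), so the size is sizeof (NvBufSurface), not 4x that. */
	*outbuf = gst_buffer_new_wrapped_full (GST_MEMORY_FLAG_ZERO_PREFIXED,
			batcher->inter_buf, sizeof (NvBufSurface), 0, sizeof (NvBufSurface),
			NULL, NULL);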

	// Create and fill batch meta
	batch_meta = nvds_create_batch_meta(4);    	
	if (batch_meta == NULL) {
		GST_ERROR ("Error: Could not create batch meta");
		goto error;
	}
	batch_meta->base_meta.batch_meta = batch_meta;
	batch_meta->base_meta.copy_func = nvds_batch_meta_copy_func;
	batch_meta->base_meta.release_func = nvds_batch_meta_release_func;
	batch_meta->max_frames_in_batch = 4;

	for(int i = 0;i < 4;i++) {
		frame_meta = nvds_acquire_frame_meta_from_pool(batch_meta);
		if (frame_meta == NULL) {
			GST_ERROR ("Error: Could not acquire frame meta from pool");
			goto error;
		}

		frame_meta->pad_index = 0;
		frame_meta->source_id = 0;
		frame_meta->buf_pts = 0;
		frame_meta->ntp_timestamp = 0;
		frame_meta->frame_num = 0;
		frame_meta->batch_id = i;
		frame_meta->source_frame_width = 500;
		frame_meta->source_frame_height = 600;
		frame_meta->num_surfaces_per_frame = 1  ;  
		nvds_add_frame_meta_to_batch(batch_meta, frame_meta);
		//g_print("%d\n", batch_meta->num_frames_in_batch);
	}
	meta = gst_buffer_add_nvds_meta (*outbuf , batch_meta, NULL, nvds_batch_meta_copy_func, nvds_batch_meta_release_func);
	meta->meta_type = NVDS_BATCH_GST_META;


	flow_ret = GST_FLOW_OK;	
	//g_print ("Prepare OK\n");	
error:	

	return flow_ret;	
}



/**
 * Called when element receives an input buffer from upstream element.
 */
static GstFlowReturn
gst_batcher_transform (GstBaseTransform * btrans, GstBuffer *inbuf, GstBuffer *outbuf)
{
	//g_print ("Transform start\n");
	GstBatcher *batcher = GST_BATCHER (btrans);
	GstMapInfo in_map_info;
	GstMapInfo out_map_info;
	GstFlowReturn flow_ret = GST_FLOW_ERROR;
	gdouble scale_ratio = 1.0;
	BatcherOutput *output;

	NvBufSurface *surface = NULL;

	CHECK_CUDA_STATUS (cudaSetDevice (batcher->gpu_id),
			"Unable to set cuda device");

	memset (&in_map_info, 0, sizeof (in_map_info));
	if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
		g_print ("Error: Failed to map gst buffer\n");
		goto error;
	}

	nvds_set_input_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
	surface = (NvBufSurface *) in_map_info.data;
	GST_DEBUG_OBJECT (batcher,
			"Processing Frame %" G_GUINT64_FORMAT " Surface %p\n",
			batcher->frame_num, surface);

	if (surface->batchSize != 1) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("Batch size should be 1"), (NULL));
		goto error;
	}

	if (CHECK_NVDS_MEMORY_AND_GPUID (batcher, surface))
		goto error;

	/* Crop the regions listed in batcher->rect_params out of the single
	 * input frame; each crop becomes one frame of the output batch. */
	batcher->batch_insurf.numFilled = 0;
	while (batcher->batch_insurf.numFilled < 4) {
		//Memset the memory
		NvBufSurfaceMemSet (batcher->inter_buf, batcher->batch_insurf.numFilled, 0, 0);

		/* Create src surfaces for NvBufSurfTransform API. */
		batcher->batch_insurf.surfaceList[batcher->batch_insurf.numFilled] = *surface->surfaceList;

		/* Set the source ROI. Could be entire frame or an object. */
		batcher->transform_params.src_rect[batcher->batch_insurf.numFilled] = {
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].top,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].left,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].width,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].height,
		};

		/* Set the dest ROI. Could be the entire destination frame or part of it to
		 * maintain aspect ratio. */
		batcher->transform_params.dst_rect[batcher->batch_insurf.numFilled] = {0, 0, 500, 600};


		batcher->batch_insurf.numFilled++;
	}

	NvBufSurfTransform_Error err;
	NvBufSurfTransformConfigParams transform_config_params;

	// Configure transform session parameters for the transformation
	transform_config_params.compute_mode = NvBufSurfTransformCompute_Default;
	transform_config_params.gpu_id = batcher->gpu_id;
	transform_config_params.cuda_stream = batcher->cuda_stream;

	err = NvBufSurfTransformSetSessionParams (&transform_config_params);
	if (err != NvBufSurfTransformError_Success) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("NvBufSurfTransformSetSessionParams failed with error %d", err),
				(NULL));
		goto error;
	}

	/* Batched transformation. */
	g_print("filled: %d\n",batcher->inter_buf->numFilled);
	err = NvBufSurfTransform (&batcher->batch_insurf, batcher->inter_buf,
			&batcher->transform_params);

	g_print("filled: %d\n",batcher->inter_buf->numFilled);
	if (err != NvBufSurfTransformError_Success) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("NvBufSurfTransform failed with error %d while converting buffer",
				 err), (NULL));
		goto error;
	}

	//g_print ("Transform OK\n");
	flow_ret = GST_FLOW_OK;

error:
	nvds_set_output_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
	gst_buffer_unmap (inbuf, &in_map_info);
	return flow_ret;
}

With the NvBufSurfTransform API, this should crop the source into 4 images and output them as a batch. But when I tested it with the tiler, only the last cropped result is shown and the other tiles are blank. Can you help me find what I am missing? Thank you!
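When crops come out blank, one thing that may be worth ruling out is a source ROI that extends past the input frame. The following is a minimal sketch under stated assumptions, not a confirmed cause of the problem above: `Rect` is a stand-in that mirrors the top, left, width, height field order used for src_rect in the code, and `clamp_rect` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for the crop rectangle; same field order as src_rect in the post.
struct Rect {
  uint32_t top;
  uint32_t left;
  uint32_t width;
  uint32_t height;
};

// Hypothetical helper: shrink a requested ROI so it lies fully inside a
// frame of frame_w x frame_h pixels. An ROI that starts outside the frame
// collapses to zero width or height.
Rect clamp_rect (Rect r, uint32_t frame_w, uint32_t frame_h)
{
  r.left = std::min (r.left, frame_w);
  r.top = std::min (r.top, frame_h);
  r.width = std::min (r.width, frame_w - r.left);
  r.height = std::min (r.height, frame_h - r.top);
  return r;
}
```

A zero-sized result would indicate that the corresponding entry in rect_params never overlapped the frame, which can be logged before NvBufSurfTransform is called.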

Please debug this yourself unless you find that something is wrong with the DeepStream APIs.

As I understand from other topics, this forum is not only for bug reports. My question is not a bug report, but the DeepStream API documentation is sparse and many of its descriptions are vague. I would appreciate any feedback on what the issue might be, and I think it would also help other developers who run into the same problem.

What do you want to know about the API?

I implemented my prepare_output_buffer and transform by following this topic. But I think in my case the issue is in NvBufSurfTransform(). My src surfaces all point to a single surface, but since it is read-only I thought this would cause no issue. Can you confirm that the src surfaces can point to the same surface?
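To make the aliasing in question concrete, here is a minimal sketch with stand-in types. It does not confirm whether NvBufSurfTransform accepts such input; it only illustrates what an assignment like `surfaceList[n] = *surface->surfaceList` does: the descriptor is copied by value, so every entry refers to the same pixel memory and only the ROI differs. `SurfaceParams`, `Rect`, `BatchRequest` and `build_request` are hypothetical names, not DeepStream types.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for one surface descriptor: dimensions plus a pointer to pixels.
struct SurfaceParams {
  uint32_t width;
  uint32_t height;
  void *dataPtr;
};

// Stand-in for a crop rectangle (top, left, width, height).
struct Rect {
  uint32_t top;
  uint32_t left;
  uint32_t width;
  uint32_t height;
};

// Stand-in for the source batch handed to a batched transform.
struct BatchRequest {
  std::vector<SurfaceParams> src;
  std::vector<Rect> src_rect;
};

// Build one source entry per ROI. Each entry is a copy of the same
// descriptor, so all entries alias the same pixel buffer.
BatchRequest build_request (const SurfaceParams &frame,
    const std::vector<Rect> &rois)
{
  BatchRequest req;
  for (const Rect &roi : rois) {
    req.src.push_back (frame);     // copies the descriptor, not the pixels
    req.src_rect.push_back (roi);
  }
  return req;
}
```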

The line in question is this:

Will the new transform plugin output multiple GstBuffers from multiple src pads?

It has 1 src pad and it will output a batch with size 4.

The batch meta should be aligned with your streams. If you modify the batch meta in this way, you should split the one stream into four streams.

Using tee, nvvideoconvert and streammux would certainly work, but it would create a big bottleneck, so I resorted to this method.

You need to follow the batch meta mechanism. All DeepStream plugins rely on this mechanism to work, and a wrong batch meta will cause other plugins to fail.

I am not sure what you mean by the batch meta mechanism. Is it not enough to add the batch meta manually in prepare_output_buffer? Linking works and my pipeline runs with the tiler without any error or warning, but 3 of the tiles are blank.
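A hand-built batch can be sanity-checked before it is pushed downstream. The sketch below uses stand-in types and rests on two assumptions that are not confirmed anywhere in this thread: that downstream elements read numFilled from the surface, and that the tiler places frames by source_id, in which case four frames that all carry source_id 0 could end up on the same tile. `FrameMeta`, `Batch` and `check_batch` are hypothetical names.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Stand-in for the per-frame fields that matter for this check.
struct FrameMeta {
  uint32_t batch_id;
  uint32_t source_id;
};

// Stand-in for the surface counters plus the attached frame metas.
struct Batch {
  uint32_t batch_size;
  uint32_t num_filled;
  std::vector<FrameMeta> frames;
};

// Returns a list of human-readable problems; empty means consistent
// under the assumptions stated above.
std::vector<std::string> check_batch (const Batch &b)
{
  std::vector<std::string> problems;
  if (b.num_filled != b.frames.size ())
    problems.push_back ("numFilled does not match the number of frame metas");

  std::set<uint32_t> batch_ids;
  std::set<uint32_t> source_ids;
  for (const FrameMeta &f : b.frames) {
    if (f.batch_id >= b.batch_size)
      problems.push_back ("batch_id out of range");
    if (!batch_ids.insert (f.batch_id).second)
      problems.push_back ("duplicate batch_id");
    if (!source_ids.insert (f.source_id).second)
      problems.push_back ("duplicate source_id");
  }
  return problems;
}
```

In the plugin, the equivalent checks would compare inter_buf->numFilled with batch_meta->num_frames_in_batch and walk the frame meta list; giving each crop its own source_id (for example `i`) is one experiment to try if duplicates are reported.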