How to manually form a batch

Also, how do I copy the content of inter_buf to inbuf? And if I want to do transform() instead of transform_ip(), can I allocate the output buffer in NVMM as well?

An NVMM buffer may be a hardware buffer, in which case it cannot be accessed directly by the CPU.
See the NvBufSurface APIs: NVIDIA DeepStream SDK API Reference: Main Page

Also, dsexample is an "in-place" transform plugin (see GstBaseTransform); you cannot replace the buffer unless the caps are exactly the same.

Yeah, I was thinking of implementing the transform() function instead of transform_ip(). Back to the original question: my cropped images are in the inter_buf surfaces, so I need to copy them to inbuf to forward them to the next element in the pipeline, right? How do I do that?

An NVMM buffer may be a hardware buffer, in which case it cannot be accessed directly by the CPU.
See the NvBufSurface APIs: NvDsBufSurface API — Deepstream Deepstream Version: 5.1 documentation

Hello! I have read through the documentation and implemented transform() and prepare_output_buffer() as follows:

static GstFlowReturn
gst_batcher_prepare_output_buffer(GstBaseTransform * btrans, GstBuffer * inbuf, GstBuffer ** outbuf)
{
	GstBatcher *batcher = GST_BATCHER (btrans);
	GstFlowReturn flow_ret = GST_FLOW_ERROR;	
	NvBufSurfaceCreateParams create_params;
	NvDsBatchMeta *batch_meta = NULL;
	NvDsFrameMeta *frame_meta = NULL;
	NvDsMeta *meta = NULL;

	//g_print ("Prepare start\n");	

	if (batcher->inter_buf)
		NvBufSurfaceDestroy (batcher->inter_buf);
	batcher->inter_buf = NULL;	

	/* allocate buffer for crop destination */
	create_params.gpuId  = batcher->gpu_id;
	create_params.width  = 500;
	create_params.height = 600;
	create_params.size = 0;
	create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
	create_params.layout = NVBUF_LAYOUT_PITCH;

	if(batcher->is_integrated) {
		create_params.memType = NVBUF_MEM_DEFAULT;
	} else {
		create_params.memType = NVBUF_MEM_CUDA_PINNED;
	}

	if (NvBufSurfaceCreate (&batcher->inter_buf, 4,
				&create_params) != 0) {
		GST_ERROR ("Error: Could not allocate internal buffer for batcher");
		goto error;
	}

	*outbuf = NULL;
	*outbuf = gst_buffer_new_wrapped_full (GST_MEMORY_FLAG_ZERO_PREFIXED, batcher->inter_buf,4 * sizeof(NvBufSurface), 0,4 * sizeof(NvBufSurface), NULL, NULL);    	

	// Create and fill batch meta
	batch_meta = nvds_create_batch_meta(4);    	
	if (batch_meta == NULL) {
		GST_ERROR ("Error: Could not create batch meta");
		goto error;
	}
	batch_meta->base_meta.batch_meta = batch_meta;
	batch_meta->base_meta.copy_func = nvds_batch_meta_copy_func;
	batch_meta->base_meta.release_func = nvds_batch_meta_release_func;
	batch_meta->max_frames_in_batch = 4;

	for(int i = 0;i < 4;i++) {
		frame_meta = nvds_acquire_frame_meta_from_pool(batch_meta);
		if (frame_meta == NULL) {
			GST_ERROR ("Error: Could not acquire frame meta from pool");
			goto error;
		}

		frame_meta->pad_index = 0;
		frame_meta->source_id = 0;
		frame_meta->buf_pts = 0;
		frame_meta->ntp_timestamp = 0;
		frame_meta->frame_num = 0;
		frame_meta->batch_id = i;
		frame_meta->source_frame_width = 500;
		frame_meta->source_frame_height = 600;
		frame_meta->num_surfaces_per_frame = 1  ;  
		nvds_add_frame_meta_to_batch(batch_meta, frame_meta);
		//g_print("%d\n", batch_meta->num_frames_in_batch);
	}
	meta = gst_buffer_add_nvds_meta (*outbuf , batch_meta, NULL, nvds_batch_meta_copy_func, nvds_batch_meta_release_func);
	meta->meta_type = NVDS_BATCH_GST_META;


	flow_ret = GST_FLOW_OK;	
	//g_print ("Prepare OK\n");	
error:	

	return flow_ret;	
}



/**
 * Called when element receives an input buffer from upstream element.
 */
static GstFlowReturn
gst_batcher_transform (GstBaseTransform * btrans, GstBuffer *inbuf, GstBuffer *outbuf)
{
	//g_print ("Transform start\n");
	GstBatcher *batcher = GST_BATCHER (btrans);
	GstMapInfo in_map_info;
	GstMapInfo out_map_info;
	GstFlowReturn flow_ret = GST_FLOW_ERROR;
	gdouble scale_ratio = 1.0;
	BatcherOutput *output;

	NvBufSurface *surface = NULL;

	CHECK_CUDA_STATUS (cudaSetDevice (batcher->gpu_id),
			"Unable to set cuda device");

	memset (&in_map_info, 0, sizeof (in_map_info));
	if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
		g_print ("Error: Failed to map gst buffer\n");
		goto error;
	}

	nvds_set_input_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
	surface = (NvBufSurface *) in_map_info.data;
	GST_DEBUG_OBJECT (batcher,
			"Processing Frame %" G_GUINT64_FORMAT " Surface %p\n",
			batcher->frame_num, surface);

	if (surface->batchSize != 1) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("Batch size should be 1"), (NULL));
		goto error;
	}

	if (CHECK_NVDS_MEMORY_AND_GPUID (batcher, surface))
		goto error;

	/* Using object crops as input to the algorithm. The objects are detected by
	 * the primary detector */
	batcher->batch_insurf.numFilled = 0;
	while (batcher->batch_insurf.numFilled < 4) {
		//Memset the memory
		NvBufSurfaceMemSet (batcher->inter_buf, batcher->batch_insurf.numFilled, 0, 0);

		/* Create src surfaces for NvBufSurfTransform API. */
		batcher->batch_insurf.surfaceList[batcher->batch_insurf.numFilled] = *surface->surfaceList;

		/* Set the source ROI. Could be entire frame or an object. */
		batcher->transform_params.src_rect[batcher->batch_insurf.numFilled] = {
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].top,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].left,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].width,
			(guint) batcher->rect_params[batcher->batch_insurf.numFilled].height,
		};

		/* Set the dest ROI. Could be the entire destination frame or part of it to
		 * maintain aspect ratio. */
		batcher->transform_params.dst_rect[batcher->batch_insurf.numFilled] = {0, 0, 500, 600};


		batcher->batch_insurf.numFilled++;
	}

	NvBufSurfTransform_Error err;
	NvBufSurfTransformConfigParams transform_config_params;

	// Configure transform session parameters for the transformation
	transform_config_params.compute_mode = NvBufSurfTransformCompute_Default;
	transform_config_params.gpu_id = batcher->gpu_id;
	transform_config_params.cuda_stream = batcher->cuda_stream;

	err = NvBufSurfTransformSetSessionParams (&transform_config_params);
	if (err != NvBufSurfTransformError_Success) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("NvBufSurfTransformSetSessionParams failed with error %d", err),
				(NULL));
		goto error;
	}

	/* Batched transformation. */
	g_print("filled: %d\n",batcher->inter_buf->numFilled);
	err = NvBufSurfTransform (&batcher->batch_insurf, batcher->inter_buf,
			&batcher->transform_params);

	g_print("filled: %d\n",batcher->inter_buf->numFilled);
	if (err != NvBufSurfTransformError_Success) {
		GST_ELEMENT_ERROR (batcher, STREAM, FAILED,
				("NvBufSurfTransform failed with error %d while converting buffer",
				 err), (NULL));
		goto error;
	}

	//g_print ("Transform OK\n");
	flow_ret = GST_FLOW_OK;

error:
	nvds_set_output_system_timestamp (inbuf, GST_ELEMENT_NAME (batcher));
	gst_buffer_unmap (inbuf, &in_map_info);
	return flow_ret;
}

Using the NvBufSurfTransform API, it should crop the source into 4 images and output them as a batch. But when I tested with the tiler, it shows only the last cropped result and the others are just blank. Can you help me find what I am missing? Thank you!
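
Two details in the code above may be worth checking; the sketch below illustrates them with stand-in types rather than the real SDK. As far as the NvBufSurface header goes, a batch is a single descriptor whose surfaceList points to batchSize entries, so the buffer would normally wrap sizeof(NvBufSurface) bytes rather than 4 * sizeof(NvBufSurface). Also, nothing in the posted code sets inter_buf->numFilled, and it is an assumption here that NvBufSurfTransform leaves that field untouched. MockSurfaceParams, MockSurface, wrapped_size and mark_filled are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for NvBufSurfaceParams / NvBufSurface.
// Only the fields needed for the illustration are modeled.
struct MockSurfaceParams {
    unsigned width;
    unsigned height;
    void *dataPtr;
};

struct MockSurface {
    unsigned batchSize;              // capacity of surfaceList
    unsigned numFilled;              // slots that currently hold valid frames
    MockSurfaceParams *surfaceList;  // points to batchSize entries
};

// Size to use when wrapping the descriptor in a buffer: the descriptor
// itself, independent of how many frames the batch holds.
inline std::size_t wrapped_size(const MockSurface &) {
    return sizeof(MockSurface);
}

// Mark the first `filled` slots as valid after a batched conversion.
inline void mark_filled(MockSurface &surf, unsigned filled) {
    assert(filled <= surf.batchSize);
    surf.numFilled = filled;
}
```

In the real plugin the equivalent would be passing sizeof(NvBufSurface) to gst_buffer_new_wrapped_full and assigning batcher->inter_buf->numFilled = 4 once the transform succeeds.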

Please debug by yourself unless you find there is something wrong with deepstream APIs.

As I understood from other topics, this forum is not only for bug reports. My question is not a bug report, and the DeepStream API documentation is incomplete and vague in many places. I would appreciate any feedback on what the issue might be, and I think it would help other developers who might have the same issue.

What do you want to know about the API?

I implemented my prepare_output_buffer and transform by following this topic. But I think in my case the issue is in NvBufSurfTransform(). My src surfaces all point to one surface, but since it is read-only I thought that would cause no issue. Can you confirm that the src surfaces can point to the same surface?

The line in question is this:

batcher->batch_insurf.surfaceList[batcher->batch_insurf.numFilled] = *surface->surfaceList;
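
For reference, the pattern being asked about (several crops taken from one source frame) can be sketched with stand-in types. The point of the sketch is that the assignment copies the surface descriptor by value, so each batch slot has its own descriptor while all of them still refer to the same pixel memory; only the source rectangle differs per slot. MockSurfaceParams, MockRect, MockCropBatch and add_crop are hypothetical names, and whether NvBufSurfTransform accepts such aliased sources is not confirmed in this thread.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for NvBufSurfaceParams and NvBufSurfTransformRect.
struct MockSurfaceParams { unsigned width, height; void *dataPtr; };
struct MockRect { unsigned top, left, width, height; };

constexpr unsigned kBatch = 4;

struct MockCropBatch {
    std::array<MockSurfaceParams, kBatch> src;  // one descriptor per crop
    std::array<MockRect, kBatch> src_rect;      // one ROI per crop
    unsigned numFilled = 0;
};

// Queue one crop of `frame`. The descriptor is copied by value, but
// dataPtr in every slot still refers to the same source pixels.
inline bool add_crop(MockCropBatch &batch, const MockSurfaceParams &frame,
                     const MockRect &roi) {
    if (batch.numFilled >= kBatch)
        return false;  // batch is full
    if (roi.left + roi.width > frame.width ||
        roi.top + roi.height > frame.height)
        return false;  // ROI falls outside the source frame
    batch.src[batch.numFilled] = frame;
    batch.src_rect[batch.numFilled] = roi;
    ++batch.numFilled;
    return true;
}
```

Rejecting out-of-bounds rectangles up front is a design choice of the sketch; it also makes it easier to spot a swapped top/left pair when the crops come out wrong.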

Will the new transform plugin output multiple GstBuffers from multiple src pads?

It has 1 src pad and it will output a batch with size 4.

The batch meta should be aligned with your streams. You should split one stream into four streams if you modify the batch meta in this way.

Using tee, nvvideoconvert, and streammux will certainly work, but it will cause a big bottleneck, so I resorted to this method.

You need to follow the batch meta mechanism. All DeepStream plugins rely on this mechanism to work. Wrong batch meta will cause other plugins to fail.

I am not sure what you mean by the batch meta mechanism. Is it not enough to manually add batch meta in prepare_output_buffer? Linking works properly and my pipeline runs with the tiler with no errors or warnings, but 3 of the tiles are blank.

As I mentioned: “The batch meta should be aligned with your streams. You should split one stream into four streams if you modify the batch meta in this way.”

Your transform plugin should have one sink pad and four src pads if you want to generate such batch meta.

Otherwise, you need to build the surface list the way it is done inside nvstreammux.
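
One way to read the advice above: in the posted prepare_output_buffer, all four frame metas carry pad_index 0 and source_id 0, so downstream elements see four frames from a single stream. It is an assumption here, not something confirmed in this thread, that the tiler picks the tile from the source ID; if so, all four crops would be drawn into the same tile, which would match the symptom. The sketch below uses a stand-in struct to show per-slot metadata in which each crop is presented as its own stream. MockFrameMeta and make_batch_meta are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the NvDsFrameMeta fields set in the post.
struct MockFrameMeta {
    unsigned pad_index;
    unsigned source_id;
    unsigned batch_id;
    int frame_num;
    unsigned source_frame_width;
    unsigned source_frame_height;
};

// Build one meta entry per batch slot. Each crop gets its own pad_index
// and source_id so that it looks like a separate stream downstream.
inline std::vector<MockFrameMeta> make_batch_meta(unsigned num_crops,
                                                  int frame_num,
                                                  unsigned width,
                                                  unsigned height) {
    assert(num_crops > 0);
    std::vector<MockFrameMeta> metas;
    metas.reserve(num_crops);
    for (unsigned i = 0; i < num_crops; ++i) {
        // batch_id must match the slot index in the surface list.
        metas.push_back(MockFrameMeta{i, i, i, frame_num, width, height});
    }
    return metas;
}
```

In the real plugin this would correspond to setting frame_meta->pad_index and frame_meta->source_id to i inside the existing loop, and passing the running frame counter instead of 0 for frame_num.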