OpenCV Mat to NvBufSurface (to use in NvBufSurfTransform)

Xavier AGX, DS5 , JP44EA, TRT7, CUDA 10.2

I’m trying to convert opencv mat to NvBufSurface (to later use NvBufSurfTransform and then feed to trt engine ).

  1. How do I convert the cv::Mat to NvBufSurface of type NVBUF_MEM_SURFACE_ARRAY (assuming I have the raw data ptr of the Mat) ? I got a bit confused with NvBufSurfaceCreate , specially where to place the raw ptr, and the create_params.layout parts.
    P.S. Is NVBUF_MEM_SURFACE_ARRAY indeed the type I need to work in the Xavier hardware ? (It’s the default)

  2. How can I decode and convert a NvBufSurface to:
    – JPEG (frame and then move to host)
    – MPEG stream packet ( and then host and to be saved into a mp4/avi file etc) ?

Thanks for the help !

Hi,
Please check the code in

deepstream_sdk_v4.0.2_jetson\sources\gst-plugins\gst-dsexample\gstdsexample.cpp

The working solution is to create RGBA buffer through NvBufSurfaceCreate(), and map it to cv::Mat

  // Map the buffer so that it can be accessed by CPU
  if (NvBufSurfaceMap (dsexample->inter_buf, 0, 0, NVBUF_MAP_READ) != 0){
    goto error;
  }

  // Cache the mapped data for CPU access
  NvBufSurfaceSyncForCpu (dsexample->inter_buf, 0, 0);

  // Use openCV to remove padding and convert RGBA to BGR. Can be skipped if
  // algorithm can handle padded RGBA data.
  in_mat =
      cv::Mat (dsexample->processing_height, dsexample->processing_width,
      CV_8UC4, dsexample->inter_buf->surfaceList[0].mappedAddr.addr[0],
      dsexample->inter_buf->surfaceList[0].pitch);

After the processing is done, call NvBufSurfaceSyncForDevice() and unmap the surface.

Hi,

The sample is from device surface to host Mat. Just to make sure I understand - If I need the reverse (openCV Mat -->surface), is this the correct order :

– create the Mat and the surface
– Map the surface
– SyncCPU and get a host ptr from the surface
– memcpy from the Mat raw ptr to the host ptr. Should I use plain memcpy, or a different function from the package ?
– unmap
– Sync for GPU (is this needed ?)

Thanks for the help !

just making clear - I get:

nvbufsurface: Wrong buffer index (0)
Err in Synccpu

from running:

// ---------------------------------
NvBufSurface* src;
NvBufSurface* dst;
NvBufSurfTransformParams* params;

// read img
Mat in1 = imread(“FRAME_23.jpg”);

// Move to devicemem
NvBufSurfaceCreateParams create_params;
create_params.gpuId = 0;
create_params.width = 1024;
create_params.height = 768;
create_params.size = 0;
create_params.colorFormat = NVBUF_COLOR_FORMAT_BGR;
create_params.layout = NVBUF_LAYOUT_PITCH; // ?
create_params.memType = NVBUF_MEM_DEFAULT;

if (NvBufSurfaceCreate (&src, 1, &create_params) != 0) {
printf (“Error: in Create\n”);
}
NvBufSurfaceMemSet (src, 0, 0, 0);

if (NvBufSurfaceMap (src, 0, 0, NVBUF_MAP_READ_WRITE) != 0){
printf(“Err in Map\n”);
}
if (NvBufSurfaceSyncForCpu (src, 0, 0) !=0) {
printf(“Err in Synccpu”);
}
memcpy(src->surfaceList[0].mappedAddr.addr[0],in1.ptr(), 1024 * 768 * 3);
// ---------------------------------

Hi,
Please try to create NVBUF_COLOR_FORMAT_RGBA.
NVBUF_COLOR_FORMAT_BGR is not supported on Jetson platforms. Hardware converter engine on Jetson platforms does not support 24-bit BGR.

Hi,

Tried- same error :-( …

Is the flow I outlined correct ? I’m trying to CPU-sync a surface that is initialised and mapped, but has no content yet …

Thanks for the help !

Hi,
I don’t see any other thing suspicious in the code. Maybe you can share full test code so that we can build/run to reproduce the error.

Hi,

compile with:

g++ test.cpp -I/usr/local/cuda-10.2/include -I/ext/ds5/opt/nvidia/deepstream/deepstream-5.0/sources/includes/ -L/opt/nvidia/deepstream/deepstream-5.0/lib/ -lnvbufsurface -o test.exe

and running ./test.exe
gives:
nvbufsurface: Wrong buffer index (0)
Err in Synccpu

code:


#include “nvbufsurface.h”
#include “nvbufsurftransform.h”

int main(int argc, char** argv) {
NvBufSurface* src;
NvBufSurfTransformParams* params;

// Move to devicemem
NvBufSurfaceCreateParams create_params;
create_params.gpuId = 0;
create_params.width = 1024;
create_params.height = 768;
create_params.size = 0;
create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
create_params.layout = NVBUF_LAYOUT_PITCH; // ?
create_params.memType = NVBUF_MEM_DEFAULT;

if (NvBufSurfaceCreate (&src, 1, &create_params) != 0) {
    printf("Error: in Create \n");
}
NvBufSurfaceMemSet (src, 0, 0, 0);

if (NvBufSurfaceMap (src, 0, 0, NVBUF_MAP_READ_WRITE) != 0){
    printf("Err in Map \n");
}
if (NvBufSurfaceSyncForCpu (src, 0, 0) !=0) {
    printf("Err in Synccpu\n");
}

}

i have similar error DS 4.0

static GstFlowReturn
gst_menudraw_prepare_output_buffer(GstBaseTransform * btrans, GstBuffer * inbuf, GstBuffer ** outbuf)
{
	GstMenuDraw *menudraw = GST_MENUDRAW (btrans);
	GstFlowReturn flow_ret = GST_FLOW_ERROR;	
	GstMapInfo out_map_info;	
	NvBufSurface *newoutsurface = NULL;    
	NvBufSurface *outsurface = NULL;
	NvBufSurfaceCreateParams create_params;
	DrawParam DrawParam;
	NvDsBatchMeta *batch_meta = NULL;
	NvDsFrameMeta *frame_meta = NULL;
	NvDsMeta *meta = NULL;
   	
   
	g_print ("Prepare start\n");	
			
	/* An intermediate buffer for NV12/RGBA to BGR conversion  will be
	* required. Can be skipped if custom algorithm can work directly on NV12/RGBA. */      
	create_params.gpuId  = menudraw->gpu_id;
	create_params.width  = menudraw->processing_width;
	create_params.height = menudraw->processing_height;
	create_params.size = 0;
	create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
	create_params.layout = NVBUF_LAYOUT_PITCH;
	#ifdef __aarch64__
	create_params.memType = NVBUF_MEM_DEFAULT;
	g_print("set surface memtype nvbuf_mem_dafault\n");
	#else
	create_params.memType = NVBUF_MEM_CUDA_UNIFIED;
	#endif
	//Make GST Buffer
	if (NvBufSurfaceCreate (&newoutsurface, 1,
		  &create_params) != 0) {
	GST_ERROR ("Error: Could not allocate internal buffer for menudraw");
	goto error;
	}
	
	
	
	*outbuf = gst_buffer_new_wrapped_full (GST_MEMORY_FLAG_ZERO_PREFIXED, newoutsurface, sizeof(NvBufSurface), 0, sizeof(NvBufSurface), NULL, NULL);
	//g_print("m1\n");
	batch_meta = nvds_create_batch_meta(1);
	//g_print("m2\n");
	meta = gst_buffer_add_nvds_meta (*outbuf , batch_meta, NULL, copy_user_meta, release_user_meta);
	//g_print("m3\n");
	meta->meta_type = NVDS_BATCH_GST_META;
	batch_meta->base_meta.batch_meta = batch_meta;
	batch_meta->base_meta.copy_func = copy_user_meta;
	batch_meta->base_meta.release_func = release_user_meta;
	batch_meta->max_frames_in_batch = 1;
	frame_meta = nvds_acquire_frame_meta_from_pool(batch_meta);
	//g_print("m4\n");
	nvds_add_frame_meta_to_batch(batch_meta, frame_meta);
	//g_print("m5\n");
	// Some important parameters to fill
	frame_meta->pad_index = 0;
	frame_meta->source_id = 0;
	frame_meta->buf_pts = 0;
	frame_meta->ntp_timestamp = 0;
	frame_meta->frame_num = 0;
	frame_meta->batch_id = 0;
	frame_meta->source_frame_width = 1024;
	frame_meta->source_frame_height = 600;
	frame_meta->num_surfaces_per_frame = 1  ;  
	
	
	CHECK_CUDA_STATUS (cudaSetDevice (menudraw->gpu_id),
		"Unable to set cuda device");

	memset (&out_map_info, 0, sizeof (out_map_info));

	if (!gst_buffer_map (*outbuf, &out_map_info, GST_MAP_READWRITE)) {
	 g_print ("Error: Failed to out map gst buffer\n");		
	 goto error; 
	}	     
	outsurface = (NvBufSurface *) out_map_info.data;
	GST_DEBUG_OBJECT (menudraw,"Processing Frame Surface %p\n",outsurface);
	if (CHECK_NVDS_MEMORY_AND_GPUID (menudraw, outsurface))
	{
	   g_print ("Error: Check NVDS Memory And Gpu id\n");
	goto error;
	}
	NvBufSurfaceMemSet(outsurface, 0, 0, 0);
	
  	
	
	
		
	AddImage(&DrawParam, &Image_Arr[INDEX_MENU_BACKGROUND_CAMERA_FLIP_CAMERA_RECT], 0, 0);
	
	if (NvBufSurfaceMap (outsurface, 0, 0, NVBUF_MAP_READ_WRITE) != 0){	
		g_print("Draw surface map Error\n");
		return GST_FLOW_ERROR;
	}
	NvBufSurfaceSyncForCpu (outsurface, 0,0);	

	Process_Draw_Menu(outsurface, &DrawParam);

	NvBufSurfaceSyncForDevice (outsurface, 0,0);	
	if (NvBufSurfaceUnMap (outsurface,  0,0)){
		g_print("Draw surface unmap Error\n");
		goto error;
	}
	flow_ret = GST_FLOW_OK;
	g_print ("Prepare OK\n");
	
	//*outbuf = menudraw->outBuff;
error:	
	gst_buffer_unmap (*outbuf, &out_map_info);
	//g_print ("Prepare unmap OK\n");
		
	return flow_ret;	

}

i got a same error on API NvBufSurfaceSyncForCpu (outsurface, 0,0)

I hope can find the answer to this topic.

1 Like

Please try

if (NvBufSurfaceCreate (&src, 1, &create_params) != 0) {
    printf("Error: in Create \n");
}
// Please add this
src->numFilled = 1;

Works great , thanks !

Unrelated q (re-asking from the original post) - maybe you can refer me to some sample code / documentation:

How can I decode and convert a NvBufSurface efficiently (done in GPU/Hardware) to:
– JPEG img frame (which I will then move to host and save)
and/or
– MPEG stream packet ( and then host and to be saved into a mp4/avi file etc) ?

Thanks again for the great help !

Hi,
DeepStream SDK is based on gstreamer. For JPEG/MJPG decoding, you can use nvv4l2decoder mjpeg=1

I’m also trying to populate a NvBufSurface from a cv::Mat but for whatever reasons fail doing so.

I’m more or less copying the code you wear using here to achieve this:

NvBufSurface *inf_buf = nullptr;
NvBufSurfaceCreateParams create_params2;
create_params2.gpuId = 0;
create_params2.width = 112;
create_params2.height = 112;
create_params2.size = 0;
create_params2.colorFormat = NVBUF_COLOR_FORMAT_BGR;
create_params2.layout = NVBUF_LAYOUT_PITCH;
create_params2.memType = NVBUF_MEM_CUDA_UNIFIED;
if (NvBufSurfaceCreate(&inf_buf, 1, &create_params2) != 0) {
GST_ELEMENT_ERROR(nvinfer, STREAM, FAILED, (“Failed creating inf_buf”), (NULL));
return GST_FLOW_ERROR;
}

// cf.
// OpenCV Mat to NvBufSurface (to use in NvBufSurfTransform) - #13 by DaneLLL
inf_buf->numFilled = 1;

NvBufSurfaceMemSet(inf_buf, 0, 0, 0);

auto map_err = NvBufSurfaceMap(inf_buf, 0, 0, NVBUF_MAP_READ_WRITE);
if (map_err != 0) {
GST_ELEMENT_ERROR(nvinfer, STREAM, FAILED, (“Failed mapping buf inf_buf: %d\n”, map_err),
(NULL));
return GST_FLOW_ERROR;
}

if (NvBufSurfaceSyncForCpu(inf_buf, 0, 0) != 0) {
GST_ELEMENT_ERROR(nvinfer, STREAM, FAILED, (“Failed syncing inf_buf for CPU”), (NULL));
return GST_FLOW_ERROR;
}
memcpy(inf_buf->surfaceList[0].mappedAddr.addr[0], crop.ptr(), 112 * 112 * 3);

Unfortunately this fails for me at NvBufSurfaceSyncForCpu. When switching create_params2.memType to NVBUF_MEM_DEFAULT as you are using here NvBufSurfaceMap fails with nvbufsurface: mapping of memory type (0) not supported.

So the solution you provided does unfortunately not work in my case.

Hi @twangbarang
Please provide information about your environment:
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

• Hardware Platform: GPU
• DeepStream Version: 5.0.0
• TensorRT Version: 7.0.0.11
• NVIDIA GPU Driver Version (valid for GPU only): 460.32.03
• Issue Type: Bug?

• How to reproduce:

#include “nvbufsurface.h”
#include “nvbufsurftransform.h”
#include <opencv2/opencv.hpp>

int main(int argc, char **argv) {
NvBufSurface *src;
NvBufSurface *dst;
NvBufSurfTransformParams *params;

cv::Mat in1 = cv::imread(“yourfavorite.jpg”);

NvBufSurfaceCreateParams create_params;
create_params.gpuId = 0;
create_params.width = 1024;
create_params.height = 768;
create_params.size = 0;
create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
create_params.layout = NVBUF_LAYOUT_PITCH;
create_params.memType = NVBUF_MEM_CUDA_PINNED;

if (NvBufSurfaceCreate(&src, 1, &create_params) != 0) {
printf(“Error in create\n”);
return 1;
}
src->numFilled = 1;

NvBufSurfaceMemSet(src, 0, 0, 0);

if (NvBufSurfaceMap(src, 0, 0, NVBUF_MAP_READ_WRITE) != 0) {
printf(“Error in map\n”);
return 1;
}

if (NvBufSurfaceSyncForCpu(src, 0, 0) != 0) {
printf(“Error in synccpu\n”);
return 1;
}

return 0;
}

I tried it with different values for create_params.memType but always hit one or the other error

Hi,
NvBufSirfaceSuncForCpu() is specific to Jetson platforms. For desktop GPU, please create CUDA stream and before access the buffer, call cudaStreamSynchronize():
CUDA Runtime API :: CUDA Toolkit Documentation