Deepstream 4.0 : Storing NV12 frame buffers to a file from PGIE sink pad callback

Environment: GeForce RTX 2060 with Driver Version: 430.26 and CUDA Version: 10.2

I am looking to save the individual (decoded) NV12 frames into a separate data store, so I tried to extend the code in deepstream_test3_app.c as explained below.

** Registering callback for every frame received over PGIE sink (same as streammux src pad) **

  streammux_src_pad = gst_element_get_static_pad (pgie, "sink");
  if (!streammux_src_pad)
    g_print ("Unable to get pgie sink pad\n");
  else
    gst_pad_add_probe (streammux_src_pad, GST_PAD_PROBE_TYPE_BUFFER,
        streammux_src_pad_buffer_probe, NULL, NULL);

** Inside streammux_src_pad_buffer_probe(), called for every buffer received **

static GstPadProbeReturn
streammux_src_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
  ...
  GstMapInfo in_map_info;
  NvBufSurface *surface = NULL;
  GstBuffer *inbuf = gst_pad_probe_info_get_buffer (info);

  memset (&in_map_info, 0, sizeof (in_map_info));
  if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
    g_print ("Error: Failed to map gst buffer\n");
    goto error;
  }

  surface = (NvBufSurface *) in_map_info.data;

  // Writing surface buffer into file
  fwrite (surface->surfaceList[0].dataPtr, surface->surfaceList[0].dataSize, 1, fp1);
  ...
}
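For completeness, the parts elided above also have to unmap the buffer and return a probe result. A minimal self-contained sketch of the whole callback (assuming batch-size=1, CPU-accessible surface memory, and an fp1 FILE* opened elsewhere, as in the snippet above):

#include <stdio.h>
#include <string.h>
#include <gst/gst.h>
#include "nvbufsurface.h"

extern FILE *fp1;   /* assumed to be opened before the pipeline starts */

static GstPadProbeReturn
streammux_src_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
  GstMapInfo in_map_info;
  GstBuffer *inbuf = gst_pad_probe_info_get_buffer (info);

  memset (&in_map_info, 0, sizeof (in_map_info));
  if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
    g_print ("Error: Failed to map gst buffer\n");
    return GST_PAD_PROBE_OK;
  }

  /* The mapped data is an NvBufSurface describing the whole batch. */
  NvBufSurface *surface = (NvBufSurface *) in_map_info.data;

  /* With batch-size=1 the batch holds exactly one frame. */
  fwrite (surface->surfaceList[0].dataPtr,
      surface->surfaceList[0].dataSize, 1, fp1);

  gst_buffer_unmap (inbuf, &in_map_info);
  return GST_PAD_PROBE_OK;
}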

Then I try to turn the dumped frames/files into an MJPEG video using ffmpeg:

ffmpeg -s:v 1920x1080 -r 20 -pix_fmt nv12 -i img_%d.raw out.mjpg

The result: when I play out.mjpg with VLC, only a few frames play before it stops; it is not the full video like the source. Since I am only partially through, I assume I have missed a required loop/indexing step somewhere, which is not clear from the examples and documentation I have referred to.

Ref : /root/deepstream_sdk_v4.0_x86_64/sources/gst-plugins/gst-dsexample/gstdsexample.cpp
Ref : NVIDIA DeepStream SDK API Reference: NvBufSurface Struct Reference

Additional reference : /deepstream_sdk_v4.0_x86_64/sources/gst-plugins/gst-nvinfer/gstnvinfer_allocator.cpp

/* Calculate pointers to individual frame memories in the batch memory and
 * insert in the vector. */
tmem->frame_memory_ptrs[i] = (char *) tmem->surf->surfaceList[i].dataPtr;

So can you please let me know what I am missing?

The requirement: I must NOT transform the NV12 buffer (arriving as input to PGIE) in any way, but just faithfully store/pass the decoded frames on for another application to use.

You need to get the batch meta first and traverse batch_meta->frame_meta_list, then get the surfaceList entry for every frame. Could you refer to the code below for the details?

static GstPadProbeReturn
tiler_src_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
#ifdef DUMP_JPG
    GstBuffer *buf = (GstBuffer *) info->data;
    NvDsMetaList * l_frame = NULL;
    NvDsMetaList * l_user_meta = NULL;
    NvDsUserMeta *user_meta = NULL;
    NvDsInferSegmentationMeta* seg_meta_data = NULL;
    // Get original raw data
    GstMapInfo in_map_info;
    char* src_data = NULL;
    if (!gst_buffer_map (buf, &in_map_info, GST_MAP_READ)) {
        g_print ("Error: Failed to map gst buffer\n");
        /* The map failed, so there is nothing to unmap here. */
        return GST_PAD_PROBE_OK;
    }
    NvBufSurface *surface = (NvBufSurface *)in_map_info.data;

    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);

    for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
      l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);
        /* Validate user meta */
        for (l_user_meta = frame_meta->frame_user_meta_list; l_user_meta != NULL;
            l_user_meta = l_user_meta->next) {
            user_meta = (NvDsUserMeta *) (l_user_meta->data);
            if (user_meta && user_meta->base_meta.meta_type == NVDSINFER_SEGMENTATION_META) {
                seg_meta_data = (NvDsInferSegmentationMeta*)user_meta->user_meta_data;
            }
        }

        src_data = (char*) malloc(surface->surfaceList[frame_meta->batch_id].dataSize);
        if(src_data == NULL) {
            g_print("Error: failed to malloc src_data \n");
            continue;
        }
        /* Stage the frame from device memory into the host buffer. */
        cudaMemcpy((void*)src_data,
                   (void*)surface->surfaceList[frame_meta->batch_id].dataPtr,
                   surface->surfaceList[frame_meta->batch_id].dataSize,
                   cudaMemcpyDeviceToHost);
        dump_jpg(src_data,
                 surface->surfaceList[frame_meta->batch_id].width,
                 surface->surfaceList[frame_meta->batch_id].height,
                 seg_meta_data, frame_meta->source_id, frame_meta->frame_num);

        if(src_data != NULL) {
            free(src_data);
            src_data = NULL;
        }
    }
    gst_buffer_unmap (buf, &in_map_info);
#endif
    return GST_PAD_PROBE_OK;
}

Thanks for your response.

While I check the code you suggested, I have a quick question

I confirmed from the debug output below that the memory type is 3 (NVBUF_MEM_CUDA_UNIFIED) for the probe data I receive on the PGIE sink:

NvBufSurface gpuId=0, batchSize=1, numFilled=1, isContiguous=0, memtype=3,
 surfaceList width=1920, height=1080 colorFormat=6 layout=0 dataSize=3110400 dataPtr=0x7fa2f2c00000
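(That dataSize is consistent with an unpadded NV12 frame: a full-resolution Y plane plus a half-height interleaved UV plane, i.e. 1920 × 1080 × 3/2 = 3110400 bytes, which also matches the blocksize used in the gst-launch command further below.)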

So I assumed cudaMemcpyDeviceToHost was NOT required (or was redundant), since such memory is accessible from both CPU and GPU; but your code suggests I need to do a cudaMemcpy. That is why I was able to use the pointer directly in my fwrite:

fwrite(surface->surfaceList[0].dataPtr, surface->surfaceList[0].dataSize, 1, fp1);

I am a bit confused between the SDK docs and your response.

Yes, if the memory type is NVBUF_MEM_CUDA_UNIFIED, you can access the memory directly from the CPU side. I was just showing how to get the raw data corresponding to each frame.
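In other words, the staging copy only matters when the memory is not CPU-mappable. A minimal sketch of that branch, reusing the surface, frame_meta, and fp1 names from the snippets above and the memType field printed in the debug output (a dGPU memory type that cudaMemcpy can read is assumed; cuda_runtime.h provides cudaMemcpy):

NvBufSurfaceParams *p = &surface->surfaceList[frame_meta->batch_id];

if (surface->memType == NVBUF_MEM_CUDA_UNIFIED) {
  /* Unified memory: the pointer is valid on the CPU, write directly. */
  fwrite (p->dataPtr, p->dataSize, 1, fp1);
} else {
  /* Device-only memory: stage through a host buffer first. */
  char *host = (char *) malloc (p->dataSize);
  if (host != NULL) {
    cudaMemcpy (host, p->dataPtr, p->dataSize, cudaMemcpyDeviceToHost);
    fwrite (host, p->dataSize, 1, fp1);
    free (host);
  }
}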

Hi,

I tried the code you suggested, traversing batch_meta->frame_meta_list as shown below, but I still see no change in behavior (perhaps because batch-size=1 is configured in dstest3_pgie_config.txt?).

GstMapInfo in_map_info;
NvBufSurface *surface         = NULL;
NvDsMetaList *l_frame         = NULL;
GstBuffer *inbuf              = gst_pad_probe_info_get_buffer (info);
NvDsBatchMeta *batch_meta     = gst_buffer_get_nvds_batch_meta (inbuf);
char file_name[256];

memset (&in_map_info, 0, sizeof (in_map_info));
if (!gst_buffer_map (inbuf, &in_map_info, GST_MAP_READ)) {
  g_print ("Error: Failed to map gst buffer\n");
  goto error;
}

surface = (NvBufSurface *) in_map_info.data;

for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
  NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);

  // build the file name
  snprintf (file_name, sizeof (file_name), "img_%d.raw", frame_meta->frame_num);

  FILE *fp1 = fopen (file_name, "wb");
  if (fp1 == NULL) {
    g_print ("Error: failed to open %s\n", file_name);
    continue;
  }

#if 0 // Seems redundant but added to check NVIDIA support response; no behavior change with this either
  char *src_data = (char *) malloc (surface->surfaceList[frame_meta->batch_id].dataSize);
  if (src_data != NULL) {
    cudaMemcpy ((void *) src_data,
                (void *) surface->surfaceList[frame_meta->batch_id].dataPtr,
                surface->surfaceList[frame_meta->batch_id].dataSize,
                cudaMemcpyDeviceToHost);
    fwrite (src_data, surface->surfaceList[frame_meta->batch_id].dataSize, 1, fp1);
    fclose (fp1);
    free (src_data);
  }
#endif

  fwrite (surface->surfaceList[frame_meta->batch_id].dataPtr,
          surface->surfaceList[frame_meta->batch_id].dataSize, 1, fp1);
  fclose (fp1);
}

The example you suggested essentially refers to https://devtalk.nvidia.com/default/topic/1060782/b/t/post/5372713/, which converts from NV12 to BGRA; but for my requirement I am just trying to store NV12 as-is and let the consuming application decide whether to use it directly or apply any transforms.

Do let me know if I am missing anything here.

How many sources did you add?

Just one source.

I am running the deepstream-test3 app, with my modifications, like this:
./deepstream-test3-app file:///root/deepstream_sdk_v4.0_x86_64/samples/streams/sample_720p.h264

It should work if there is just one source. Please check whether any frames are missing.

Without my changes, that sample program with the sample video runs over 1441 frames; with my changes I create/write/dump the same number of files, so I believe no frames are missed.

Is there any other debug data I can print to confirm what I am missing?
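For instance, inside the frame loop I could print a per-frame trace like this to verify every frame is visited exactly once (a sketch using the standard NvDsFrameMeta and NvBufSurfaceParams fields):

g_print ("source_id=%u batch_id=%u frame_num=%d dataSize=%u dataPtr=%p\n",
    frame_meta->source_id, frame_meta->batch_id, frame_meta->frame_num,
    surface->surfaceList[frame_meta->batch_id].dataSize,
    surface->surfaceList[frame_meta->batch_id].dataPtr);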

Maybe you can try, as in https://devtalk.nvidia.com/default/topic/1060782/b/t/post/5372713/, to dump the frames directly to JPG to check where it goes wrong.

I tried the command below, and I can confirm the stored file is valid by opening it with GIMP:

gst-launch-1.0 filesrc blocksize=3110400 location=/root/image_dump/stream_0_img_1440.raw ! \
    'video/x-raw,format=(string)NV12,width=(int)1920,height=(int)1080,framerate=(fraction)0/1' ! \
    jpegenc ! 'image/jpeg, width=(int)1920,height=(int)1080,framerate=(fraction)0/1' ! \
    filesink location=/root/image_dump/img_1440_from_h264.jpg

So maybe the process of creating an MJPEG video directly from the NV12 frames with ffmpeg is not right?
Is there a gst-launch command you can suggest to ensure I get a valid MJPEG video?
The individual frames converted to JPEG look OK, but the assembled video plays only a few frames.
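One pipeline I am considering (an untested sketch, assuming multifilesrc pushes each raw file as one buffer with the given caps; the out_mjpg.avi name is just an example):

gst-launch-1.0 multifilesrc location="img_%d.raw" index=0 \
    caps="video/x-raw,format=NV12,width=1920,height=1080,framerate=20/1" ! \
    jpegenc ! avimux ! filesink location=out_mjpg.avi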

Not quite sure, but I think you can write a simple Python script using OpenCV to do this.

I am able to play the video generated with “ffmpeg -s:v 1920x1080 -pix_fmt nv12 -i img_%d.raw out.mjpg” with ffplay, but not with VLC, which either plays extremely jerkily or plays just a few frames and stops!
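(One possible factor: this invocation omits the -r 20 input rate I used earlier, so ffmpeg falls back to its default rate for the sequence, which may explain the broken timing in VLC.)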

At least this confirms that my understanding of the NvBufSurface interface for storing the NV12 frames is valid.

Thanks for your support in highlighting that I need to read the frame_meta list to go through all the frames (serialized into batched frame buffers from multiple sources by streammux).