Usage of NvBuffer APIs

dumbogeorge · November 6, 2017, 6:43pm

Hi Folks,

I would like to encode the output of my OpenCV algorithms on a TX1. I am reading frames using the Argus API and feeding them to my OpenCV algorithms. The output frames of the algorithm are then given to a GStreamer encode pipeline via ‘appsrc’.

The output of the OpenCV algorithms is in YUV420 format. When these output frames are given to the encoder, the encoded output seems to have its U and V planes switched, and the encoded video frames (as decoded by mplayer) are very blocky.

Could someone please help debug why the encoded output (which is a legal MP4 bitstream) is so blocky?

My code can be found at - GitHub - pcgamelore/SingleCameraPlaceholder: Sample code the read Jetson Tx1/Tx2 cameras, encode, process the frame, and encode processed output.

Thanks,

DaneLLL · November 7, 2017, 1:53am

The input to the encoder can be I420 or NV12. It looks like you are not giving the encoder input in the correct format.
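To make the two accepted formats concrete: I420 stores Y, then a separate U plane, then a separate V plane, while NV12 stores Y followed by one plane of interleaved U,V pairs. The sketch below converts between them on the CPU. It assumes a tightly packed frame (no row padding) with even width and height; the function name `i420_to_nv12` is made up for illustration and is not part of any NVIDIA or GStreamer API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a tightly packed I420 frame (Y plane, U plane, V plane)
// into NV12 (Y plane, then interleaved U,V pairs).
// Assumption: even width/height and no padding at the end of rows.
std::vector<uint8_t> i420_to_nv12(const std::vector<uint8_t>& i420,
                                  int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const size_t y_size = static_cast<size_t>(width) * height;
    const size_t c_size = y_size / 4;  // samples in each chroma plane
    assert(i420.size() == y_size + 2 * c_size);

    std::vector<uint8_t> nv12(i420.size());
    // The luma plane is identical in both formats.
    std::copy(i420.begin(), i420.begin() + y_size, nv12.begin());

    const uint8_t* u = i420.data() + y_size;
    const uint8_t* v = u + c_size;
    uint8_t* uv = nv12.data() + y_size;
    for (size_t i = 0; i < c_size; ++i) {
        uv[2 * i] = u[i];      // U sample first
        uv[2 * i + 1] = v[i];  // then the matching V sample
    }
    return nv12;
}
```

If the planes are handed over in a different order than the declared format expects (V before U, for instance), the picture would still decode but with wrong colors, which is one possible explanation for chroma that looks swapped.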

dumbogeorge · November 7, 2017, 5:24am

Hi DaneLLL

I am initializing encoder input format to I420.

m_pgstVideoMeta       = gst_buffer_add_video_meta_full(m_pgstBuffer,GST_VIDEO_FRAME_FLAG_NONE, GST_VIDEO_FORMAT_I420, imageWidth,imageHeight, 3, m_offset, m_stride );

Furthermore, the buffers (actually the luma component) which go in as input to the encoder can be displayed correctly using imshow…

//cv::imshow("img",img);

I would like to display all of the Y, U and V components of the buffer which goes in as input to the encoder, in aaDebug.cpp::start_feed(). Is there a way to display/render an image from separate Y, U and V pointers?
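One low-tech way to look at each plane on its own is to write it out as a binary PGM (grayscale) file, which most image viewers open directly. The sketch below is only an illustration: the helper names are hypothetical, and it assumes each plane is 8-bit with a known pitch (bytes between row starts), which may be larger than the visible width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Write one 8-bit plane as a binary PGM ("P5") image.
// Only `width` bytes of every row are written, so pitch padding is skipped.
void write_plane_as_pgm(std::ostream& out, const uint8_t* plane,
                        int pitch, int width, int height) {
    assert(plane != nullptr && width > 0 && height > 0 && pitch >= width);
    out << "P5\n" << width << ' ' << height << "\n255\n";
    for (int row = 0; row < height; ++row) {
        const uint8_t* src = plane + static_cast<size_t>(row) * pitch;
        out.write(reinterpret_cast<const char*>(src), width);
    }
}

// Convenience wrapper that returns the PGM bytes as a string.
std::string plane_to_pgm(const uint8_t* plane, int pitch,
                         int width, int height) {
    std::ostringstream buffer;
    write_plane_as_pgm(buffer, plane, pitch, width, height);
    return buffer.str();
}
```

For a YUV420 frame, the Y plane would be written with (width, height) and the U and V planes with (width/2, height/2), each with its own pitch, to an `std::ofstream` opened in binary mode. A chroma plane that looks like a sheared or striped picture usually points at a wrong pitch rather than at the encoder.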

Would be great if you can try this code out.

Thanks,

DaneLLL · November 8, 2017, 8:55am

Hi dumbogeorge,
You are not using appsrc. This looks like a bug in your code:

GstElement *videoSource = gst_element_factory_make("nveglstreamsrc", NULL);

dumbogeorge · November 8, 2017, 9:06am

Hi DaneLLL,

There are two encoders in the code. One encodes input to opencv algorithm. Another one encodes output of algorithm. The encoder code you are referring to - would encode input of opencv algorithm.

I am having issues with output encoder. Please take a look at common/aaDebug.cpp. Here the encoder uses appsrc.

m_pappsrc              = (GstAppSrc*)gst_element_factory_make("appsrc", "aa-appsrc");

The data path is like -

Camera → Queue → OCVConsumer → outputEncoderQ (in aaOCVConsumer.cpp) → This Q is popped in aaDebug.cpp::start_feed() and given to encoder.

Thanks,

DaneLLL · November 8, 2017, 9:28am

Hi dumbogeorge,
1. autovideoconvert should not be required.
2. Please dump one YUV frame and check it with a YUV viewer (such as 7yuv, http://blog.datahammer.de/).

dumbogeorge · November 8, 2017, 6:41pm

Hi DaneLLL,

It segfaults if we remove autovideoconvert. That probably needs to be debugged, but for now we are focused on getting the images encoded correctly. I have checked in code for this. Please create a directory named ‘yuvframes’. All YUV images will be dumped into this directory.
I dump the frames (YUV) just before feeding them to the encoder, and they look visually fine. The encoded video, however, is highly blocky. Could something be wrong with the encoder? It would be great if you could reproduce this.

Thanks,

DaneLLL · November 9, 2017, 4:37am

Please refer to attached a.cpp demonstrating appsrc ! omxh265enc ! filesink:

$ g++ -Wall -std=c++11  a.cpp -o test $(pkg-config --cflags --libs gstreamer-app-1.0) -ldl
$ gst-launch-1.0 videotestsrc num-buffers=1 ! video/x-raw,format=I420,width=2048,height=1080 ! filesink location=a.yuv
$ ./test
$ gst-launch-1.0 filesrc location=a.h265 ! h265parse ! omxh265dec ! nvoverlaysink

I don’t see any issue with the I420 generated by videotestsrc. Please compare that YUV with yours.
a.cpp (2.46 KB)

dumbogeorge · November 13, 2017, 1:40pm

Hi DaneLLL
Thanks for your code. If I were to encode frames captured directly from the camera, maybe using libargus, I guess I would need to modify the pipeline (so that I can read the buffer placed in memory by the camera, which is subsequently modified by our OpenCV algorithm). That is, change

launch_stream
    << "appsrc name=mysource ! "
    << "video/x-raw,width="<< w <<",height="<< h <<",framerate=30/1,format=I420 ! "
    << "omxh265enc ! video/x-h265,stream-format=byte-stream ! "
    << "filesink location=a.h265 ";

to -

launch_stream
    << "appsrc name=mysource ! "
    << "video/x-raw(memory:NVMM),width="<< w <<",height="<< h <<",framerate=30/1,format=I420 ! "
    << "omxh265enc ! video/x-h265,stream-format=byte-stream ! "
    << "filesink location=a.h265 ";

How would I run feed_function ‘on demand’, rather than every 33 ms in a loop, like you have it here -

for (int i=0; i<150; i++) {
        feed_function(nullptr);
        usleep(33333);
    }

I guess we can use ‘need-data’ signal of appsrc - is that right ?

Thanks

DaneLLL · November 14, 2017, 1:40am

Hi dumbogeorge,
For Argus → gstreamer pipeline, please refer to tegra_multimedia_api\argus\samples\gstVideoEncode

Here is also a post for reference:
https://devtalk.nvidia.com/default/topic/1025961/jetson-tx2/adding-overlay-to-the-tegra-camera-api-argus-quot-gstvideoencode-quot-sample/post/5219519/#5219519

appsrc can only take CPU buffers (video/x-raw).

I am not sure, but appsrc should be able to run in both active and passive modes. Other users may be able to share their experience with it.

dumbogeorge · November 15, 2017, 7:23pm

Hi DaneLLL

Please note that I am first processing the frame on the CPU (i.e. accessing frames from the CPU) and then giving it to the encoder via appsrc.

I am not sure that my method of mapping the frame on the CPU is the right one. In the code at the link above, this is how I am mapping the camera frame on the CPU (aaCamCapture.cpp) -

char *m_datamem  = (char *)mmap(NULL, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[0]);
char *m_datamemU = (char *)mmap(NULL, fsizeU,PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[1]);
char *m_datamemV = (char *)mmap(NULL, fsizeV,PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[2]);

If I write Y in a file -

fwrite(m_datamem, sizeof(char), fsize, fp);

and then display the image as grayscale, it comes out fine. However, when I also write U and V to the same file, like -

fwrite(m_datamem, sizeof(char), fsize, fp);
fwrite(m_datamemU, sizeof(char), fsizeU, fp);
fwrite(m_datamemV, sizeof(char), fsizeV, fp);

and then display the image as a YUV color image, the colors do not come out well. Is it possible that data is partially stuck in the CPU caches? In that case the DDR buffer will not get updated and the encoder will not pick up the right data.

Do I need to be using APIs like

int NvBufferMemMap (int dmabuf_fd, unsigned int plane, NvBufferMemFlags memflag, void **pVirtAddr);
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);
int NvBufferMemUnMap (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

As you suggested in this thread -

https://devtalk.nvidia.com/default/topic/1025494/how-to-receive-csi-camera-frame-in-unified-memory-buffer/

Thanks

DaneLLL · November 16, 2017, 3:41am

Hi dumbogeorge,
I know I420/NV12 is supported in appsink from OpenCV 3.3. It only supports gray and BGR on 3.2.
Here is a post about 3.3:
https://devtalk.nvidia.com/default/topic/1024245/jetson-tx2/opencv-3-3-and-integrated-camera-problems-/post/5210735/#5210735

We do not have experience with every use case. Other users may share their experience, if any.

dumbogeorge · November 16, 2017, 5:04am

Hi DaneLLL,
I do not understand the connection with OpenCV here.

I used

int NvBufferMemMap (int dmabuf_fd, unsigned int plane, NvBufferMemFlags memflag, void **pVirtAddr);
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);
int NvBufferMemUnMap (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

and the encode corruption is gone. It is still very blocky. I suspect it is a CPU cache issue. It does not show up in your code in #8, because you are always feeding the same image. If you feed video from a file instead of a static image, then you would see the blocky/poor quality encode.

Could you please explain the difference between NvBufferMemSyncForCpu() and NvBufferMemSyncForDevice()? Which device is being referred to here? Would using NvBufferMemSyncForCpu() guarantee that data is not stuck in CPU caches, and that the memory buffers given to the encoder (via appsrc) are coherent with the data in the CPU caches?

Thanks

DaneLLL · November 16, 2017, 6:29am

Hi dumbogeorge,
device means HW blocks on TX1 such as the GPU and encoders.

Description of NvBufferMemSyncForDevice():

* This should be called after CPU writes to memory and before HW access it,
 * to avoid HW getting stale data from memory. In other words, before HW
 * can take over ownership of buffer from CPU.

dumbogeorge · November 20, 2017, 1:40pm

Hi DaneLLL,

After using NvBufferMemSyncForDevice(), I am still unable to get rid of the blocking artifacts in the encoded output file. I am not sure the U and V buffers are being properly flushed from the CPU caches before the encoder reads them.

Would you be kind enough either to try my code in - GitHub - pcgamelore/SingleCameraPlaceholder: Sample code the read Jetson Tx1/Tx2 cameras, encode, process the frame, and encode processed output.
or feed your code given in #8, with moving images (rather than same static image) from CPU/appsrc, to see if your encoder output is proper ?

Thanks

DaneLLL · November 21, 2017, 5:27am

Hi dumbogeorge, you don’t assign pts correctly. This may generate a bitstream with an incorrect bitrate.
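The attached a.cpp is not reproduced in this thread, so the following is only a sketch of one common way to derive a presentation timestamp and duration for each frame of a constant-frame-rate stream. GStreamer timestamps are in nanoseconds; the struct and function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct FrameTiming {
    uint64_t pts_ns;       // presentation timestamp in nanoseconds
    uint64_t duration_ns;  // time until the next frame starts
};

// Timing for frame `frame_index` at a rate of fps_num / fps_den frames/s.
// Both values are derived from the index, so rounding does not accumulate.
// Assumption: frame_index * 1e9 * fps_den fits in 64 bits.
FrameTiming frame_timing(uint64_t frame_index,
                         uint32_t fps_num, uint32_t fps_den) {
    assert(fps_num > 0 && fps_den > 0);
    const uint64_t kNsPerSecond = 1000000000ULL;
    const uint64_t start = frame_index * kNsPerSecond * fps_den / fps_num;
    const uint64_t next = (frame_index + 1) * kNsPerSecond * fps_den / fps_num;
    return FrameTiming{start, next - start};
}
```

With 30/1 fps, frame 0 starts at 0 ns and frame 30 at exactly one second. In a GStreamer feed function these two values would typically be stored with GST_BUFFER_PTS(buffer) and GST_BUFFER_DURATION(buffer) before the buffer is pushed; gst_util_uint64_scale() is the overflow-safe alternative to the plain multiplication used here.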

dumbogeorge · November 21, 2017, 9:21am

Hi DaneLLL,

I am pretty sure an encoder like MSENC would NOT modulate its rate control/bitrate based on PTS. Anyway, I have fixed the PTS update, like you were doing in your code. Please check. I have a feeling that either -

1. the encoder is getting stale data from the CPU (data is still in the CPU caches) when it reads input frames, or
2. the encoder is not able to make use of the pitch/offsets of the NvBuffer. I use

NvBufferParams params;
       NvBufferGetParams(fd, &params);

to get parameters of buffer given by camera, and pass those values to encoder via -

gsize          m_offset[3];
    gint           m_stride[3];
    m_offset[0]    = framedata.nvBuffParams.offset[0];
    m_offset[1]    = framedata.nvBuffParams.offset[1];
    m_offset[2]    = framedata.nvBuffParams.offset[2];
    m_stride[0]    = framedata.nvBuffParams.pitch[0]; 
    m_stride[1]    = framedata.nvBuffParams.pitch[1]; 
    m_stride[2]    = framedata.nvBuffParams.pitch[2];

int size              = imageWidth * imageHeight * 1.5;
    m_pgstBuffer          = gst_buffer_new_wrapped_full( (GstMemoryFlags)0, (gpointer)(framedata.dataY), size, 0, size, NULL, NULL );
    m_pgstVideoMeta       = gst_buffer_add_video_meta_full(m_pgstBuffer,GST_VIDEO_FRAME_FLAG_NONE, GST_VIDEO_FORMAT_I420, imageWidth,imageHeight, 3, m_offset, m_stride );

Thanks

DaneLLL · November 21, 2017, 9:26am

Hi dumbogeorge,
NvBuffer is not supported in a gstreamer pipeline. You need to allocate a CPU buffer with size = width x height x 1.5 for YUV420.

If you use NvBuffer, you have to use tegra_multimedia_api.
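As a rough illustration of what a CPU buffer of that size looks like for I420, the sketch below computes the plane offsets, strides and total size of a tightly packed frame. It assumes even width and height and no row padding, which is not what the camera's NvBuffer provides; the names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

struct I420Layout {
    size_t offset[3];  // byte offset of the Y, U and V planes
    int stride[3];     // bytes per row of the Y, U and V planes
    size_t size;       // total buffer size in bytes
};

// Layout of a tightly packed I420 frame: full-resolution Y plane followed
// by U and V planes subsampled by 2 in both directions.
I420Layout tight_i420_layout(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const size_t y_size = static_cast<size_t>(width) * height;
    const size_t c_size = static_cast<size_t>(width / 2) * (height / 2);

    I420Layout layout;
    layout.stride[0] = width;
    layout.stride[1] = width / 2;
    layout.stride[2] = width / 2;
    layout.offset[0] = 0;
    layout.offset[1] = y_size;           // U starts right after Y
    layout.offset[2] = y_size + c_size;  // V starts right after U
    layout.size = y_size + 2 * c_size;   // equals width * height * 3 / 2
    return layout;
}
```

For the 2048x1080 frame used in the videotestsrc example earlier in the thread, this gives a 3317760-byte buffer with U at offset 2211840 and V at offset 2764800. Offsets and pitches copied from NvBufferParams describe a different, padded layout and would not match a buffer of this size.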

dumbogeorge · November 21, 2017, 9:33am

Thanks DaneLLL for quick response.

Can you please suggest how to read a camera frame via the Argus API (from the CPU) and pass it to the encoder? My problem is that the camera uses some memory alignment, so that pitch != width. There is also a gap in memory between the end of the Y buffer and the start of the U buffer, and similarly a gap between the end of the U buffer and the start of the V buffer.
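One CPU-side way to deal with pitch != width and gaps between planes is to copy each plane row by row into a single contiguous buffer, dropping the padding. This is a minimal sketch, not the approach of the 10_camera_recording sample: it assumes the three planes are already mapped and readable from the CPU, that the pitch of each plane is known (for example from NvBufferParams), and that width and height are even. `PlaneView` and `pack_i420` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct PlaneView {
    const uint8_t* data;  // first byte of the mapped plane
    int pitch;            // bytes between consecutive row starts
};

// Copy three padded planes into one tightly packed I420 buffer
// of size width * height * 3 / 2.
std::vector<uint8_t> pack_i420(const PlaneView& y, const PlaneView& u,
                               const PlaneView& v, int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const int cw = width / 2;
    const int ch = height / 2;
    assert(y.pitch >= width && u.pitch >= cw && v.pitch >= cw);

    std::vector<uint8_t> packed(static_cast<size_t>(width) * height +
                                2 * static_cast<size_t>(cw) * ch);
    uint8_t* dst = packed.data();
    // Copy only the visible bytes of each row; padding is skipped.
    auto copy_plane = [&dst](const PlaneView& p, int w, int h) {
        for (int row = 0; row < h; ++row) {
            std::memcpy(dst, p.data + static_cast<size_t>(row) * p.pitch, w);
            dst += w;
        }
    };
    copy_plane(y, width, height);
    copy_plane(u, cw, ch);
    copy_plane(v, cw, ch);
    return packed;
}
```

The packed buffer is the kind of plain video/x-raw I420 frame that appsrc can carry. The extra copy costs CPU time on every frame, which is the trade-off against staying in NvBuffer and using tegra_multimedia_api end to end.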

Thanks

DaneLLL · November 21, 2017, 9:37am

hi dumbogeorge,
You should run tegra_multimedia_api\samples\10_camera_recording
