Usage of NvBuffer APIs

Hi DaneLLL,
This example seems relevant, but it still does not seem to match my case. In this example you are reading a frame from the camera and giving it to the encoder; I do not see you accessing or modifying pixels of the frame.

In our case we want to read the camera frame and ‘modify’ its pixels before supplying it to the encoder. My doubt is that the modified pixels will get stuck in CPU caches unless you use a call like NvBufferMemMapSyncCpu(), specifically to flush CPU caches.

Do you think that if I follow 10_camera_recording and modify the frame before it is given to the encoder, the encoder will read the modified data/pixels? If so, how are caches flushed in 10_camera_recording?

Thanks,

Hi dumbogeorge,
The samples are general and may not be specific to your case. You have to do the integration yourself.

We have the following APIs in tegra_multimedia_api\include\nvbuf_utils.h:
[GPU]

/**
* This method must be used for getting `EGLImage` from `dmabuf-fd`.
*
* @param[in] display `EGLDisplay` object used during the creation of `EGLImage`.
* @param[in] dmabuf_fd `DMABUF FD` of the buffer from which the `EGLImage` is to be created.
*
* @returns `EGLImageKHR` for success, `NULL` for failure
*/
EGLImageKHR NvEGLImageFromFd (EGLDisplay display, int dmabuf_fd);

/**
* This method must be used for destroying an `EGLImage` object.
*
* @param[in] display `EGLDisplay` object used for destroying `EGLImage`.
* @param[in] eglImage `EGLImageKHR` object to be destroyed.
*
* @returns 0 for success, -1 for failure
*/
int NvDestroyEGLImage (EGLDisplay display, EGLImageKHR eglImage);
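
For GPU access, the typical pattern is to wrap the dmabuf fd as an EGLImage and register it with CUDA. A minimal sketch (assuming an initialized EGL display and a current CUDA context; gpu_access is a hypothetical helper, and error handling is trimmed):

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <cuda.h>
#include <cudaEGL.h>
#include "nvbuf_utils.h"

void gpu_access(EGLDisplay display, int dmabuf_fd)
{
    // Wrap the dmabuf fd as an EGLImage.
    EGLImageKHR image = NvEGLImageFromFd(display, dmabuf_fd);
    if (image == NULL)
        return;

    // Register the EGLImage with CUDA and get a mapped frame whose
    // plane pointers (frame.frame.pPitch[i]) can be passed to kernels.
    CUgraphicsResource resource;
    CUeglFrame frame;
    cuGraphicsEGLRegisterImage(&resource, image,
                               CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
    cuGraphicsResourceGetMappedEglFrame(&frame, resource, 0, 0);

    // ... launch CUDA kernels on the mapped planes here ...

    cuGraphicsUnregisterResource(resource);
    NvDestroyEGLImage(display, image);
}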

[CPU]

/**
* This method must be used for hw memory cache sync for the CPU.
* @param[in] dmabuf_fd DMABUF FD of buffer.
* @param[in] plane video frame plane.
* @param[in] pVirtAddr Virtual address pointer of the mem mapped plane.
*
* @returns 0 for success, -1 for failure.
*/
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

/**
* This method must be used for hw memory cache sync for device.
* @param[in] dmabuf_fd DMABUF FD of buffer.
* @param[in] plane video frame plane.
* @param[in] pVirtAddr Virtual address pointer of the mem mapped plane.
*
* @returns 0 for success, -1 for failure.
*/
int NvBufferMemSyncForDevice (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

/**
* This method must be used for getting the mem mapped virtual address of the plane.
* @param[in] dmabuf_fd DMABUF FD of buffer.
* @param[in] plane video frame plane.
* @param[in] memflag NvBuffer memory flag.
* @param[out] pVirtAddr Virtual address pointer of the mem mapped plane.
*
* @returns 0 for success, -1 for failure.
*/
int NvBufferMemMap (int dmabuf_fd, unsigned int plane, NvBufferMemFlags memflag, void **pVirtAddr);

/**
* This method must be used to unmap the mapped virtual address of the plane.
* @param[in] dmabuf_fd DMABUF FD of buffer.
* @param[in] plane video frame plane.
* @param[in] pVirtAddr Mem mapped virtual address pointer of the plane.
*
* @returns 0 for success, -1 for failure.
*/
int NvBufferMemUnMap (int dmabuf_fd, unsigned int plane, void **pVirtAddr);
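
Putting the CPU-side calls together, a minimal sketch (not from any sample; the plane index and the painted rows are arbitrary) of modifying pixels between camera and encoder:

#include <string.h>
#include "nvbuf_utils.h"

// Assumes `fd` is a dmabuf fd created with NvBufferLayout_Pitch.
int modify_y_plane(int fd)
{
    void *vaddr = NULL;

    // Map plane 0 (Y) into the CPU address space.
    if (NvBufferMemMap(fd, 0, NvBufferMem_Read_Write, &vaddr) != 0)
        return -1;

    // Make CPU caches coherent with what the HW wrote.
    NvBufferMemSyncForCpu(fd, 0, &vaddr);

    NvBufferParams par;
    NvBufferGetParams(fd, &par);

    // Modify pixels: paint a bright horizontal bar in the Y plane.
    unsigned char *y = (unsigned char *)vaddr;
    for (unsigned int row = 100; row < 110 && row < par.height[0]; row++)
        memset(y + row * par.pitch[0], 0xFF, par.width[0]);

    // Flush CPU caches so the HW block (e.g. the encoder) sees the new data.
    NvBufferMemSyncForDevice(fd, 0, &vaddr);

    NvBufferMemUnMap(fd, 0, &vaddr);
    return 0;
}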

So why is it difficult for you to call these APIs to synchronize caches?

Hi dumbogeorge,
If you see issues in calling the APIs, please share a patch on 10_camera_recording so that we can reproduce it with the onboard ov5693. In some cases you need to synchronize caches to the CPU or to the Tegra HW blocks accordingly, or the data may be out of sync.

Hi DaneLLL,

  1. " For somes cases you need to synchronize caches to CPU or to Tegra HW blocks accordingly, or the data may be off-synchronized"

Can you please clarify whether this applies to the encoder HW block or not?

  2. I do not have the 10_camera_recording example working on my TX2 board yet. Calling the NvBuffer* APIs does not seem difficult, although I still need some help getting them to work. I have updated my code -
https://github.com/pcgamelore/SingleCameraPlaceholder

and I have added calls to map the buffer planes (in aaCamCapture.cpp) -

NvBufferMemMap(fd, 0, NvBufferMem_Read_Write, m_datamemY);
NvBufferMemMap(fd, 1, NvBufferMem_Read_Write, m_datamemU);
NvBufferMemMap(fd, 2, NvBufferMem_Read_Write, m_datamemV);

and before feeding the buffer to encoder I do -

NvBufferMemSyncForCpu(framedata.nvBuffParams.dmabuf_fd, 0, framedata.ydata);
NvBufferMemSyncForCpu(framedata.nvBuffParams.dmabuf_fd, 1, framedata.udata);
NvBufferMemSyncForCpu(framedata.nvBuffParams.dmabuf_fd, 2, framedata.vdata);

NvBufferMemSyncForDevice(framedata.nvBuffParams.dmabuf_fd, 0, framedata.ydata);
NvBufferMemSyncForDevice(framedata.nvBuffParams.dmabuf_fd, 1, framedata.udata);
NvBufferMemSyncForDevice(framedata.nvBuffParams.dmabuf_fd, 2, framedata.vdata);

However, the first call to NvBufferMemMap fails with a segfault:

Framerate set to : 30 at NvxVideoEncoderSetParameterNvMMLiteOpen : Block : BlockType = 8 
===== MSENC =====
NvMMLiteBlockCreate : Block : BlockType = 8 
nvbuf_utils: NvRmMemMap function failed... Exiting...
Segmentation fault (core dumped)

Could you spot the problem with my code? Would it be easy for you to try it?

Thanks

It is a bit over the line that you keep requesting us to debug your code.

We have samples to demonstrate various functions, and users have to integrate them into their use case by themselves. If there are issues, users have to provide a patch against one of the samples so that we can reproduce the issue and check.

So please share a patch on 10_camera_recording and we can reproduce it with ov5693.
We don’t have enough resources to study each user’s code line by line.

Hi DaneLLL

It is indeed embarrassing to be taking so long on this. I will look to put this change into 10_camera_recording. Any idea why I would get a segfault on

NvBufferMemMap(fd,0,NvBufferMem_Read_Write,m_datamemY);

I don’t have any example using this API in tegra_multimedia_api/…, nor do I have the source code, so it is somewhat hard to debug. I will put this into 10_camera_recording so you can take a look.

Thanks.

Attached is a sample demonstrating the APIs, based on 10_camera_recording. Verified with the onboard ov5693:

10_camera_recording$ ./camera_recording -d 10 -c
10_camera_recording$ export DISPLAY=:0
10_camera_recording$ ../00_video_decode/video_decode output.h264 H264

main.cpp (20.2 KB)

Thanks DaneLLL for helping out with the modified 10_camera_recording for my use case. I am taking your changes into my code. Will update.
Thanks,

Hi DaneLLL

I tried your code. I am not able to see the figure ‘N’ in the encoded image. I tried playing with the font size, but that does not seem to help.

I am expecting the figure ‘N’ to appear in the encoded video. Is that what you drew over the input image?

Thanks,

I’d like to follow up and ask a few more questions about the sample you’ve provided in this thread:

  1. How is the total size of the mapped buffer established, if we want to copy it to an external object?
  2. The sizes provided by NvBufferParams seem to be questionable. For example, I’m reading frames from the camera, requesting 1280x720 resolution. NvBufferParams contains the following values:

1280x720 p1=1280 p2=768 p3=768 o1=0 o2=1048576 o3=1441792 s1=1048576 s2=393216 s3=393216
where the resolution comes from NvBufferParams::width/height, pX is NvBufferParams::pitch, oX is NvBufferParams::offset, and sX is NvBufferParams::psize.

The sizes and offsets are not what I would expect from a 1280x720 frame. Can you please go a bit into details on how to copy the buffer into an external buffer with a planar YUV420 layout?


Hi alexm5m91,
I applied the print to 10_camera_recording:

diff --git a/multimedia_api/ll_samples/samples/10_camera_recording/main.cpp b/multimedia_api/ll_samples/samples/10_camera_recording/main.cpp
index 369bbce..63379d2 100644
--- a/multimedia_api/ll_samples/samples/10_camera_recording/main.cpp
+++ b/multimedia_api/ll_samples/samples/10_camera_recording/main.cpp
@@ -253,9 +253,14 @@ bool ConsumerThread::threadExecute()
             ORIGINATE_ERROR("IImageNativeBuffer not supported by Image.");
         fd = iNativeBuffer->createNvBuffer(STREAM_SIZE,
                                            NvBufferColorFormat_YUV420,
-                                           NvBufferLayout_BlockLinear);
+                                           NvBufferLayout_Pitch);
         if (VERBOSE_ENABLE)
             CONSUMER_PRINT("Acquired Frame. %d\n", fd);
+        NvBufferParams par;
+        NvBufferGetParams (fd, &par);
+        CONSUMER_PRINT("Y p %d w %d h %d \n", par.pitch[0], par.width[0], par.height[0]);
+        CONSUMER_PRINT("U p %d w %d h %d \n", par.pitch[1], par.width[1], par.height[1]);
+        CONSUMER_PRINT("V p %d w %d h %d \n", par.pitch[2], par.width[2], par.height[2]);

         // Push the frame into V4L2.

and ran it at 1280x720:

./camera_recording -r 1280x720

and got the print:

CONSUMER: Y p 1280 w 1280 h 720
CONSUMER: U p 768 w 640 h 360
CONSUMER: V p 768 w 640 h 360

A YUV420 frame is width x height x 1.5 bytes: width x height bytes of Y, plus (width/2) x (height/2) bytes each of U and V.

For 1280x720, Y is 1280x720 = 921,600 bytes, and U and V are 640x360 = 230,400 bytes each.
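
If you want to copy the frame into a contiguous planar YUV420 buffer, map each plane, copy width bytes per row, and skip the pitch padding. Below is a minimal sketch (copy_to_packed_yuv420 is a hypothetical helper, not part of the samples; dst must hold width x height x 1.5 bytes). Note that the psize and offset values you quoted appear to include allocation alignment, which is why they exceed width x height:

#include <string.h>
#include "nvbuf_utils.h"

int copy_to_packed_yuv420(int fd, unsigned char *dst)
{
    NvBufferParams par;
    if (NvBufferGetParams(fd, &par) != 0)
        return -1;

    for (unsigned int plane = 0; plane < par.num_planes; plane++) {
        void *vaddr = NULL;
        if (NvBufferMemMap(fd, plane, NvBufferMem_Read, &vaddr) != 0)
            return -1;
        NvBufferMemSyncForCpu(fd, plane, &vaddr);

        // Copy only width bytes per row; the rest of each pitch is
        // HW alignment padding and not part of the image.
        unsigned char *src = (unsigned char *)vaddr;
        for (unsigned int row = 0; row < par.height[plane]; row++) {
            memcpy(dst, src + row * par.pitch[plane], par.width[plane]);
            dst += par.width[plane];
        }
        NvBufferMemUnMap(fd, plane, &vaddr);
    }
    return 0;
}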

Hi DaneLLL,

Could you please give a line about what you expect to appear in the encoded video with your code? Is it the figure ‘N’?

Thanks

DaneLLL, thanks, I got image acquisition working for the most part. However, the incoming image is flipped, and no format other than V4L2_PIX_FMT_YUV420M seems to work. Given that pitch != width in most tested scenarios, the user is forced to copy the frame plane by plane, and line by line within each plane, which is quite slow. Is there a way to acquire a frame either in a different color space or in contiguous YUV420? If not, what is the optimal way to convert the frame to, say, RGB24?

Additionally, is NvVideoConverter the fastest available API to change image orientation?

Hi alexm5m91,
Please realize that ‘pitch != width’ is a HW limit. In the HW design, the input and output need alignment and cannot have arbitrary values; for example, the 640-pixel-wide chroma planes above are padded to a 768-byte pitch.

For format conversion, you can do it via NvVideoConverter or CUDA.
A sample for NvVideoConverter is tegra_multimedia_api\samples\07_video_convert. You can check its capabilities via

$ ./video_convert -h

You can also do it via CUDA programming. Some examples are at
https://github.com/dusty-nv/jetson-inference/blob/master/util/cuda/cudaYUV-NV12.cu
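
For reference, the per-pixel math itself is simple. Here is a minimal CUDA sketch (my own illustration, not from the linked repo; assumes BT.601 full-range coefficients and 8-bit pitch-linear planes) converting planar YUV420 to packed RGB24:

__global__ void yuv420_to_rgb24(const unsigned char *yp, const unsigned char *up,
                                const unsigned char *vp, unsigned char *rgb,
                                int width, int height, int y_pitch, int uv_pitch)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height)
        return;

    // Each 2x2 block of luma shares one U and one V sample.
    float Y = yp[y * y_pitch + x];
    float U = up[(y / 2) * uv_pitch + (x / 2)] - 128.0f;
    float V = vp[(y / 2) * uv_pitch + (x / 2)] - 128.0f;

    float r = Y + 1.402f * V;
    float g = Y - 0.344f * U - 0.714f * V;
    float b = Y + 1.772f * U;

    unsigned char *p = rgb + (y * width + x) * 3;
    p[0] = (unsigned char)fminf(fmaxf(r, 0.0f), 255.0f);
    p[1] = (unsigned char)fminf(fmaxf(g, 0.0f), 255.0f);
    p[2] = (unsigned char)fminf(fmaxf(b, 0.0f), 255.0f);
}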

Does NvVideoConverter run on the GPU, or is it CPU-only code?

Hi,
NvVideoConverter runs on a dedicated HW block (VIC), independent of both the GPU and the CPU.