Decoder_unit_sample in MMAPI, extracting nv12 data is inefficient

The dump_raw_dmabuf() function uses a for loop to extract the data. In my tests, this code is very inefficient.

What is the efficient way to extract data from nvbuf_surf->surfaceList?
How can the memory in nvbuf_surf->surfaceList[0].mappedAddr.addr[plane] be converted into cuda device memory?

Hi,
Hardware DMA buffers are aligned, so each row can carry extra pixels beyond the visible width. To dump frame data to a file, we have to copy line by line to eliminate those additional pixels.
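As a rough illustration of the line-by-line copy, here is a minimal sketch in plain C++. It is not the sample's code: copy_plane_packed() and its parameter names are hypothetical, and it assumes you already have a CPU-mapped pointer to the plane plus its pitch (bytes per row including padding) and the number of valid bytes per row. When pitch equals the row size there is no padding, and the whole plane can go out in one memcpy, which is the only shortcut available on the CPU side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of MMAPI): copy one plane out of a pitched
// buffer into a tightly packed buffer, dropping the padding at the end of each row.
//   src       : start of the CPU-mapped plane
//   pitch     : bytes per row in the source, including alignment padding
//   row_bytes : bytes of real pixel data per row (width * bytes per pixel)
//   rows      : number of rows in the plane
std::vector<uint8_t> copy_plane_packed(const uint8_t* src, size_t pitch,
                                       size_t row_bytes, size_t rows)
{
    assert(pitch >= row_bytes);
    std::vector<uint8_t> packed(row_bytes * rows);
    if (packed.empty())
        return packed;

    if (pitch == row_bytes) {
        // No padding: the plane is already contiguous, copy it in one call.
        std::memcpy(packed.data(), src, packed.size());
        return packed;
    }

    // Padding present: copy only the valid bytes of each row.
    for (size_t y = 0; y < rows; ++y)
        std::memcpy(packed.data() + y * row_bytes, src + y * pitch, row_bytes);
    return packed;
}
```

For NV12 this would be called twice: once for the Y plane (row_bytes = width, rows = height) and once for the interleaved UV plane (row_bytes = width, rows = height / 2), taking the pitch from the surface's plane parameters.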

To get a CUDA buffer from an NvBufSurface, please refer to cuda_postprocess() in

/usr/src/jetson_multimedia_api/samples/12_camera_v4l2_cuda

That’s what I’m doing now.

But I find it slow and costly in terms of performance. Is there a high-performance method to extract DMA buffer data?

The memory type of the DMA buffer is NVBUF_MEM_SURFACE_ARRAY, which cannot be used by functions of the [_device] type.

How can I make NVBUF_MEM_SURFACE_ARRAY memory usable by functions of the [_device] type without copying the data?

Hi,
With the following function calls:

    NvEGLImageFromFd();
    cuGraphicsEGLRegisterImage();
    cuGraphicsResourceGetMappedEglFrame();

You can get a CUDA pointer to the buffer, with no additional memory copy. It is the optimal method on Jetson platforms.

Can you provide a specific usage or example?

Hi,
It is demonstrated in HandleEGLImage(). The code of the function is in

/usr/src/jetson_multimedia_api/samples/common/algorithm/cuda
