Most efficient way to transfer a captured video frame to NvBufSurface

oviano · January 25, 2024, 10:54am

I use a Magewell Eco Capture M.2 card to capture video frames to malloc'd memory using MWCaptureVideoFrameToVirtualAddress.

I then transfer the data to an NvBufSurface for encoding using nvenc on the Jetson.

I use NvBufSurfaceAllocate with memType NVBUF_MEM_SURFACE_ARRAY to create an NvBufSurface and use a pooling system to re-use these surfaces as required.

I use the functions NvBufSurfaceMap, memcpy (to copy the video data), NvBufSurfaceSyncForDevice and NvBufSurfaceUnMap to transfer the captured video data.

Although it works in real-time, I get high CPU usage from the memcpy.
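For reference, the copy step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code from the post: the helper names copy_plane and copy_nv12_to_planes are hypothetical, the frame is assumed to be semi-planar 4:2:0 (Y plane followed directly by an interleaved UV plane), and the destination pointers are ordinary memory standing in for the per-plane addresses a mapped surface would provide. No Jetson API is called here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one plane. Source and destination may use different pitches
 * (bytes between the starts of consecutive rows). */
static void copy_plane(uint8_t *dst, size_t dst_pitch,
                       const uint8_t *src, size_t src_pitch,
                       size_t row_bytes, size_t rows)
{
    assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);
    if (rows == 0)
        return;
    if (dst_pitch == src_pitch) {
        /* Equal pitches: one bulk copy. Any row padding in the source
         * is copied into the destination's row padding as well. */
        memcpy(dst, src, src_pitch * (rows - 1) + row_bytes);
        return;
    }
    for (size_t r = 0; r < rows; r++)
        memcpy(dst + r * dst_pitch, src + r * src_pitch, row_bytes);
}

/* Copy a contiguous semi-planar 4:2:0 frame into two destination
 * planes that live at unrelated addresses (hypothetical helper).
 * row_bytes is the number of visible bytes per row; for semi-planar
 * 4:2:0 it is the same for the Y rows and the interleaved UV rows. */
static void copy_nv12_to_planes(uint8_t *dst_y, size_t dst_y_pitch,
                                uint8_t *dst_uv, size_t dst_uv_pitch,
                                const uint8_t *src, size_t src_pitch,
                                size_t row_bytes, size_t height)
{
    /* In the contiguous source, UV starts right after the last Y row. */
    const uint8_t *src_uv = src + src_pitch * height;
    copy_plane(dst_y, dst_y_pitch, src, src_pitch, row_bytes, height);
    copy_plane(dst_uv, dst_uv_pitch, src_uv, src_pitch, row_bytes, height / 2);
}
```

When the capture pitch matches the surface pitch, each plane collapses to a single memcpy, which is about as cheap as a CPU copy gets; the cost that remains is the memory bandwidth of the copy itself.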

I suspect (hope?) there is a more efficient way, such as:

capturing directly to a mapped NvBufSurface.
something else?

How are others dealing with this? What is the optimal path I should be taking?

DaneLLL · January 26, 2024, 2:07am

Hi,
If the device’s driver supports V4L2 and can capture frame data through it, you can try this sample:

/usr/src/jetson_multimedia_api/samples/12_v4l2_camera_cuda

If it does not support v4l2, your solution should be optimal.

oviano · January 26, 2024, 9:27am

So the problem I’m having when trying to capture directly to a mapped NvBufSurface is that, when a surface is mapped, the planes are not contiguous in memory; there is some padding between the Y and the UV plane…yet the only function I can use to capture the video frame expects a single pointer to the start of the planes…

I suppose I could allocate a surface using NVBUF_MEM_HANDLE for the capture, but how do I then efficiently transfer that to a surface allocated using NVBUF_MEM_SURFACE_ARRAY for transforming/encoding?

DaneLLL · January 26, 2024, 12:53pm

Hi,
NvBufSurface is a hardware DMA buffer, and each plane has its own data alignment. Please check whether you can change the data layout of the source to fit the alignment.

oviano wrote: “I suppose I could allocate a surface using NVBUF_MEM_HANDLE for the capture, but how do I then efficiently transfer that to a surface allocated using NVBUF_MEM_SURFACE_ARRAY for transforming/encoding?”

We don’t support this function, since the surface array has to account for alignment.

oviano · January 26, 2024, 12:55pm

Thanks for confirming.

oviano · January 30, 2024, 1:42pm

At the current time it isn’t possible to change the data layout of the source; it expects the Y and UV planes to be joined as a contiguous block. I can specify the pitch, but not any gap between the planes.

The NvBufSurfaceMap() call maps the surface and gives back two pointers, accessible via surfaceList[0].mappedAddr.addr[0] and surfaceList[0].mappedAddr.addr[1].

I’m not really too sure how memory-mapping works, but from my experiments these two pointers can be anywhere, for example:

30/01/24 13:19:42.823791873 [default] DEBUG : VIDEO_FRAME: ----------------------------------------------------------------------
30/01/24 13:19:42.823861541 [default] DEBUG : VIDEO_FRAME: Mapped buffer surface
30/01/24 13:19:42.823896679 [default] DEBUG : VIDEO_FRAME: ----------------------------------------------------------------------
30/01/24 13:19:42.823955883 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].width = 3840
30/01/24 13:19:42.823986157 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].height = 2160
30/01/24 13:19:42.824015823 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].pitch = 7680
30/01/24 13:19:42.824059601 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].dataSize = 25034752
30/01/24 13:19:42.824093683 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.num_planes = 2
VIDEO_FRAME: buffer_surface->surfaceList[0].mappedAddr.addr[0] = 0xffff08088000
30/01/24 13:19:42.824130710 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.width[0] = 3840
30/01/24 13:19:42.824158807 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.height[0] = 2160
30/01/24 13:19:42.824184345 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.pitch[0] = 7680
30/01/24 13:19:42.824209498 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.offset[0] = 0
30/01/24 13:19:42.824235836 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.psize[0] = 16646144
VIDEO_FRAME: buffer_surface->surfaceList[0].mappedAddr.addr[1] = 0xffff18402000
30/01/24 13:19:42.824266430 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.width[1] = 1920
30/01/24 13:19:42.824293791 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.height[1] = 1080
30/01/24 13:19:42.824320801 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.pitch[1] = 7680
30/01/24 13:19:42.824347043 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.offset[1] = 16646144
30/01/24 13:19:42.824374468 [default] DEBUG : VIDEO_FRAME: buffer_surface->surfaceList[0].planeParams.psize[1] = 8388608

Sometimes the addr[1] pointer is lower than the addr[0] pointer; there seems to be no relationship between them.

I wonder if it is possible, somehow, to have these planes mapped contiguously such that they can be written to as one plane by the capture card driver, maybe via a different memory map command?
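The numbers in the log above can be checked with a little arithmetic. The sketch below uses only values printed in that log; the helper names plane_padding and planes_virtually_contiguous are hypothetical and exist just to make the calculation explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of alignment padding that follow the visible rows of a plane:
 * the allocated plane size minus pitch * rows. */
static uint64_t plane_padding(uint64_t psize, uint64_t pitch, uint64_t rows)
{
    return psize - pitch * rows;
}

/* True only if plane 1 is mapped exactly offset1 bytes after plane 0
 * in the process's virtual address space. */
static int planes_virtually_contiguous(uint64_t addr0, uint64_t addr1,
                                       uint64_t offset1)
{
    return addr1 == addr0 + offset1;
}

/* Values copied from the log (hypothetical constant names). */
static const uint64_t LOG_PITCH   = 7680;
static const uint64_t LOG_Y_ROWS  = 2160;
static const uint64_t LOG_UV_ROWS = 1080;
static const uint64_t LOG_PSIZE0  = 16646144;
static const uint64_t LOG_PSIZE1  = 8388608;
static const uint64_t LOG_OFFSET1 = 16646144;
static const uint64_t LOG_ADDR0   = 0xffff08088000ULL;
static const uint64_t LOG_ADDR1   = 0xffff18402000ULL;
```

With these values, the Y plane carries 16646144 - 7680 * 2160 = 57344 bytes of padding after its last row, which is the gap the capture call cannot skip. The two mapped addresses differ by 272080896 bytes rather than the 16646144 bytes given by offset[1], so in this run the planes were mapped as two separate regions.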

DaneLLL · January 31, 2024, 2:38am

Hi,
This is a hard requirement for multi-plane formats and it cannot be adapted. I’m not sure whether your source can generate single-plane formats such as YUV422 (YUYV, UYVY, etc.). If the source supports such a format, we would suggest switching to it and trying again.

oviano · January 31, 2024, 7:36am

Hmm, the source can generate single-plane formats, but all those packed formats are 4:2:2, whereas the source is 4:2:0 and I want to end up encoding 4:2:0, so I’d be going through an undesirable conversion:

4:2:0 source → 4:2:2 capture → 4:2:0 conversion → encode

rather than the current:

4:2:0 source → 4:2:0 capture → encode
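To make the extra step concrete, here is a rough CPU sketch of what a 4:2:2 to 4:2:0 conversion involves. It is illustrative only and rests on assumptions not stated in the thread: 8-bit samples, packed YUYV input (bytes Y0 U Y1 V per pixel pair), NV12 output, and simple averaging of chroma across each pair of rows. The function name yuyv_to_nv12 is hypothetical. On a Jetson this kind of conversion would more likely be handed to the hardware (for example through NvBufSurfTransform) than done on the CPU, but either way it is an additional pass over every frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert packed YUYV 4:2:2 to semi-planar NV12 4:2:0 (8-bit samples).
 * width and height must be even. Chroma from each pair of rows is
 * averaged, which discards half of the vertical chroma resolution. */
static void yuyv_to_nv12(const uint8_t *src, size_t src_pitch,
                         uint8_t *dst_y, size_t y_pitch,
                         uint8_t *dst_uv, size_t uv_pitch,
                         size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    for (size_t row = 0; row < height; row += 2) {
        const uint8_t *top = src + row * src_pitch;
        const uint8_t *bot = top + src_pitch;
        uint8_t *y_top = dst_y + row * y_pitch;
        uint8_t *y_bot = y_top + y_pitch;
        uint8_t *uv = dst_uv + (row / 2) * uv_pitch;
        for (size_t x = 0; x < width; x += 2) {
            const uint8_t *t = top + 2 * x;  /* Y0 U Y1 V, upper row */
            const uint8_t *b = bot + 2 * x;  /* Y0 U Y1 V, lower row */
            /* Luma is copied unchanged for both rows. */
            y_top[x] = t[0];
            y_top[x + 1] = t[2];
            y_bot[x] = b[0];
            y_bot[x + 1] = b[2];
            /* One U and one V per 2x2 block, rounded average. */
            uv[x] = (uint8_t)((t[1] + b[1] + 1) / 2);
            uv[x + 1] = (uint8_t)((t[3] + b[3] + 1) / 2);
        }
    }
}
```

Besides the extra work per frame, the chroma has already been upsampled once by the capture path before being downsampled again here, which is why the round trip is undesirable when both the source and the encoder are 4:2:0.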

oviano · February 6, 2024, 2:59pm

It turns out that the wonderful folk at Magewell are working on some enhancements to their SDK allowing both zero-copy and multi-planar capture, which should solve all of this.

If anyone is researching capture cards, Magewell’s support is really good, by the way.

system · February 21, 2024, 4:14am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
