I use a Magewell Eco Capture M.2 capture card to capture video frames into malloc'd memory using MWCaptureVideoFrameToVirtualAddress.
I then transfer the data to an NvBufSurface for encoding using nvenc on the Jetson.
I use NvBufSurfaceAllocate with memType NVBUF_MEM_SURFACE_ARRAY to create an NvBufSurface and use a pooling system to re-use these surfaces as required.
I use the functions NvBufSurfaceMap, memcpy (to copy the video data), NvBufSurfaceSyncForDevice and NvBufSurfaceUnMap to transfer the captured video data.
Although it works in real-time, I get high CPU usage from the memcpy.
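A minimal sketch of the copy step described above, assuming an NV12-style layout (a Y plane immediately followed by an interleaved UV plane) in the malloc'd capture buffer. The function names are hypothetical. The destination pointers would be the two mapped plane addresses (mappedAddr.addr[0] and mappedAddr.addr[1]), and the destination pitches are assumed to come from the surface's plane parameters. This only illustrates what the CPU copy has to do; it does not remove the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one plane, honouring different source and destination pitches.
 * If both pitches equal the row size, one memcpy covers the whole plane. */
void copy_plane(uint8_t *dst, size_t dst_pitch,
                const uint8_t *src, size_t src_pitch,
                size_t row_bytes, size_t rows)
{
    assert(dst && src && row_bytes <= dst_pitch && row_bytes <= src_pitch);
    if (dst_pitch == row_bytes && src_pitch == row_bytes) {
        memcpy(dst, src, row_bytes * rows);
        return;
    }
    for (size_t y = 0; y < rows; ++y)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
}

/* Copy a contiguous NV12 frame (hypothetical helper) into two separately
 * mapped destination planes that may have padding and different pitches. */
void copy_nv12_frame(const uint8_t *src, size_t src_pitch,
                     size_t width, size_t height,
                     uint8_t *dst_y, size_t dst_y_pitch,
                     uint8_t *dst_uv, size_t dst_uv_pitch)
{
    assert(width % 2 == 0 && height % 2 == 0);
    /* In the contiguous source, UV starts right after the last Y row. */
    const uint8_t *src_uv = src + src_pitch * height;
    copy_plane(dst_y, dst_y_pitch, src, src_pitch, width, height);
    /* Interleaved UV: same row width in bytes, half as many rows. */
    copy_plane(dst_uv, dst_uv_pitch, src_uv, src_pitch, width, height / 2);
}
```

When the capture pitch matches the destination pitch, each plane collapses to a single memcpy, which is the cheapest the CPU path can get.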
I suspect (hope?) there is a more efficient way, such as:
- capturing directly to a mapped NvBufSurface
- something else?
How are others dealing with this, and what is the optimal path I should be taking?
The problem I’m having with capturing directly to a mapped NvBufSurface is that, when a surface is mapped, the planes are not contiguous in memory: there is some padding between the Y and UV planes. Yet the only function I can use to capture the video frame expects a single pointer to the start of the planes.
I suppose I could allocate a surface using NVBUF_MEM_HANDLE for the capture, but how do I then efficiently transfer that to a surface allocated using NVBUF_MEM_SURFACE_ARRAY for transforming/encoding?
Hi,
An NvBufSurface is a hardware DMA buffer, and each plane has its own data alignment. Please check whether you can change the data layout of the source to fit the alignment.
"I suppose I could allocate a surface using NVBUF_MEM_HANDLE for the capture, but how do I then efficiently transfer that to a surface allocated using NVBUF_MEM_SURFACE_ARRAY for transforming/encoding?"
We don’t support this function since the surface array has to consider alignment.
At the current time it isn’t possible to change the data layout of the source; it expects the Y and UV planes to be joined as a contiguous block. I can specify the pitch, but not any gap between the planes.
The NvBufSurfaceMap() command maps the surface DMA and gives back two pointers accessible via surfaceList[0].mappedAddr.addr[0] and surfaceList[0].mappedAddr.addr[1].
I’m not really too sure how memory mapping works, but from my experiments these two pointers can be anywhere. For example, the addr[1] pointer is sometimes lower than the addr[0] pointer; there seems to be no relationship between them.
I wonder if it is possible, somehow, to have these planes mapped contiguously such that they can be written to as one plane by the capture card driver, maybe via a different memory map command?
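One way to test whether a particular mapping happens to be usable as a single block is to compare the second plane's address with the end of the first plane. The sketch below is only an illustration: the function name is hypothetical, and the pitch and row count are assumed to be the values the capture driver would write with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true only if the UV plane starts exactly where the Y plane ends,
 * i.e. a writer given just y_addr and y_pitch would land its UV data on uv_addr. */
bool planes_are_contiguous(const void *y_addr, const void *uv_addr,
                           size_t y_pitch, size_t y_rows)
{
    assert(y_addr && uv_addr);
    uintptr_t y_end = (uintptr_t)y_addr + y_pitch * y_rows;
    return (uintptr_t)uv_addr == y_end;
}
```

With the behaviour described above (padding between planes, or addr[1] below addr[0]), this check would return false, and a single-pointer capture into the mapped surface would not be safe.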
Hi,
This is a hard requirement for multi-plane formats and it cannot be adapted. We are not sure whether your source can generate single-plane formats such as YUV422 (YUYV, UYVY, etc.). If the source supports such a format, we would suggest changing to it and trying again.
Hmm, the source can generate single-plane formats, but all those packed formats are 4:2:2, whereas the source is 4:2:0 and I want to end up encoding 4:2:0. So I’d be going through an undesirable conversion from 4:2:0 to 4:2:2 and back to 4:2:0.
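To put a rough number on that detour, here is a small sketch comparing 8-bit frame sizes for a two-plane 4:2:0 layout (NV12) and a packed 4:2:2 layout (YUYV). The function names are hypothetical and pitch padding is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* NV12 (4:2:0): full-resolution Y plane plus an interleaved UV plane
 * that has the same row width in bytes but half the rows. */
size_t nv12_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + width * (height / 2);
}

/* YUYV (packed 4:2:2): two bytes per pixel. */
size_t yuyv_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0);
    return width * height * 2;
}
```

For an illustrative 1920x1080 frame this gives 3,110,400 bytes for NV12 versus 4,147,200 bytes for YUYV, so the packed route moves about a third more data per frame before the chroma is downsampled again.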
It turns out that the wonderful folk at Magewell are working on some enhancements to their SDK allowing both zero-copy and multi-planar capture which should solve all of this.
If anyone is researching capture cards, Magewell’s support is really good, by the way.