Unexpected results from 960x540 YUV 422 NvBuffer and conversion failure

(Cross-posting from "Unexpected results from 960x540 YUV 422 NvBuffer and conversion failure" to a hopefully more relevant forum.)

Hi, in our system we ingest various video inputs via hardware V4L2 inputs and have them written into DMA buffers that we allocate before ingest begins (we have methods for JetPack 4 and 5). The frames in these DMA buffers are then scaled and their colour spaces converted to YUV420 during the NvBufferTransform or NvTransform calls. Things get shuffled around a bit more, and eventually the video frames are rendered on the screen directly from DMA with EGL.

The problem I’m facing occurs when the resolutions we’re ingesting are between 856x480 and 960x540 (the ones we’ve tried so far): there appears to be a buffer pitch mismatch, and the frame looks fragmented and incorrect. Other resolutions (1920x1080, 1280x720, 640x360, and a few lower ones) all work and display fine. Again, all the input colour spaces are YUV 422. The only difference I can see when inspecting the NvBufferParams (or the equivalent on JP5) is that the pitch is more than 2x the width, whereas for all other resolutions it is exactly 2x. For example, for 640x360 the pitch is 1280 and for 1920 it’s 3840, but for 960 it’s 2048 instead of the expected 1920. This is the only lead I have as to why the frame wouldn’t be correct for these specific resolutions.
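For illustration, the pitches reported above are consistent with each row being rounded up to a multiple of 256 bytes. That 256-byte figure is an assumption inferred only from the numbers in this post, not a documented value, and the helper names below are hypothetical. A minimal sketch of the arithmetic:

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
constexpr std::uint32_t alignUp(std::uint32_t value, std::uint32_t alignment) {
    return ((value + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: expected row pitch in bytes for a packed pixel format.
constexpr std::uint32_t expectedPitch(std::uint32_t widthPixels,
                                      std::uint32_t bytesPerPixel,
                                      std::uint32_t alignmentBytes) {
    return alignUp(widthPixels * bytesPerPixel, alignmentBytes);
}

// ASSUMPTION: 256-byte row alignment, guessed from the observed pitches.
constexpr std::uint32_t kAssumedAlignment = 256;

// YUYV (YUV 4:2:2 packed) uses 2 bytes per pixel.
static_assert(expectedPitch(640, 2, kAssumedAlignment) == 1280, "640 wide: no padding");
static_assert(expectedPitch(1920, 2, kAssumedAlignment) == 3840, "1920 wide: no padding");
static_assert(expectedPitch(960, 2, kAssumedAlignment) == 2048, "960 wide: 128 bytes of padding per row");
```

Under that assumption, 960 pixels at 2 bytes each gives 1920 bytes, which is not a multiple of 256, so each row is padded to 2048.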

We’re also seeing issues with NV12 at all resolutions. At 1280x720, however, the pitch is the expected 1280 and the luminance looks correct, but the chrominance looks wrong.

Do you know of any reason why the pitch for 960 behaves like this? Is there anything we can do to adjust the pitch back?

I’m seeing the same behaviour in the jetson_multimedia_api 12_camera_v4l2_cuda sample with the launch string ./camera_v4l2_cuda -d /dev/video0 -s 960x540 -f YUYV -v
This occurs on JetPack 4.6 (on Xavier NX) and JetPack 5.1.2 (on Orin NX).

[Image: YUV422 960x540]

[Image: NV12 1080p]

This is expected, since DMA buffers have an alignment requirement. For certain resolutions the pitch is not equal to the width, so the frame data has to be copied line by line to match the alignment of the DMA buffer. The 12 sample has a mechanism to check this: if the frame data cannot fit the alignment, the frame is captured to a CPU buffer first and then copied to the DMA buffer.
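The line-by-line copy described above can be sketched roughly as follows. This is a generic illustration using plain memory pointers, not code from the 12 sample. The function name and parameters are hypothetical, and a real implementation would first have to map the DMA buffer for CPU access and sync it afterwards.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `rows` rows of `rowBytes` bytes each from a tightly packed source
// into a destination whose rows start every `dstPitch` bytes.
// Returns false if the arguments are inconsistent.
bool copyPackedToPitched(const std::uint8_t* src, std::size_t rowBytes,
                         std::size_t rows, std::uint8_t* dst,
                         std::size_t dstPitch) {
    if (src == nullptr || dst == nullptr || dstPitch < rowBytes) {
        return false;
    }
    for (std::size_t y = 0; y < rows; ++y) {
        // Source rows are contiguous; destination rows are dstPitch apart,
        // so the padding bytes at the end of each destination row are skipped.
        std::memcpy(dst + y * dstPitch, src + y * rowBytes, rowBytes);
    }
    return true;
}
```

For a 960x540 YUYV frame with the pitch reported in this thread, the call would use rowBytes = 1920, rows = 540, and dstPitch = 2048.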

I find it very odd that a pitch of 1920 is not supported by the DMA alignment! Is there any documentation anywhere that talks about the memory alignment that would let me deduce what is and isn’t supported directly?

Secondly, the first image I provided is from the default 12 camera sample, so the mechanism internally doesn’t work. (I had a look for it through the sample but I’m not sure where it is.)

Please check the 12 sample; it contains this if condition:

        if (ctx->cam_pixfmt == V4L2_PIX_FMT_GREY &&
            pSurf->surfaceList[0].pitch != pSurf->surfaceList[0].width)
            ctx->capture_dmabuf = false;

Please modify it to

        if (pSurf->surfaceList[0].pitch != pSurf->surfaceList[0].width)
            ctx->capture_dmabuf = false;

And try with your camera source.

You can get pitch, width, height of an NvBufSurface from surfaceList[0].pitch, surfaceList[0].width, surfaceList[0].height.

I saw this part and figured this was what you may have been referring to but figured I should just wait and see. Won’t this mean that no camera source will ever use DMA ingest with the sample? Opening a 1920x1080 YUV420 NvBuffer with pitched memory results in a pitch of 2048, meaning you would force CPU to always be used no matter what.
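To make the concern concrete: going by the figures reported earlier in this thread, the pitch is a byte count while the width is a pixel count, so for a 2-byte-per-pixel format such as YUYV the two differ even when there is no row padding (640 wide gives a pitch of 1280). A possible refinement, which is not something suggested in this thread and uses a hypothetical helper name, is to compare the pitch against the width multiplied by the bytes per pixel of the first plane:

```cpp
#include <cstdint>

// Hypothetical helper, not part of the sample: true when a row has no
// padding, i.e. the pitch in bytes equals the number of payload bytes per row.
constexpr bool rowsAreTightlyPacked(std::uint32_t pitchBytes,
                                    std::uint32_t widthPixels,
                                    std::uint32_t bytesPerPixel) {
    return pitchBytes == widthPixels * bytesPerPixel;
}

// Values taken from the pitches reported in this thread.
static_assert(rowsAreTightlyPacked(1280, 640, 2), "640 wide YUYV: packed");
static_assert(!rowsAreTightlyPacked(2048, 960, 2), "960 wide YUYV: padded");
static_assert(!rowsAreTightlyPacked(2048, 1920, 1), "1920 wide luma plane: padded");
```

With a check like this, only the padded cases would fall back to the CPU buffer path, rather than every YUYV resolution.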

I hardly see this as a solution, and am still seeking DMA alignment information.

You can get data alignment by checking the parameters:

Please check whether your source can generate frame data that fits the alignment. If not, please capture to a CPU buffer first and then copy the data to the NvBufSurface.