I am extremely confused about the whole NvBufSurface layout. The documentation / header files are not very helpful for me.
Is there a high level overview somewhere for this? I see various fragments on the forum but never an explanation.
In my particular case, I am attached to the nvv4l2decoder src pad.
v4l2src ! nvv4l2decoder mjpeg=1 ! nvstreammux ! …
The input is 4096x2160 from CU135, format is UYVY.
In the src pad callback, from my prints, I get:
** width 4096 height 2160 pitch 4096 colorFormat 2 num_planes 3
** plane 0 w 4096 ht 2160 p 4096 o 0 psz 8912896 bpp 1
** plane 1 w 2048 ht 1080 p 2048 o 8912896 psz 2228224 bpp 1
** plane 2 w 2048 ht 1080 p 2048 o 11141120 psz 2228224 bpp 1
The nvv4l2decoder output should be NV12 (YUV420), right? That is what the colorFormat above suggests.
Why is plane 0 psize=8912896? 4096*2160 is 8847360 - that is 64K too large.
Planes 1 and 2 are similarly oversized.
My first attempt was to simply do a cudaMemcpy, but the total dataSize does not add up.
If I want to extract only the Y plane, what is the process?
I really do not need to convert to RGB since I just need the luma
If I want to feed this GPU buffer to a custom CUDA kernel, what is the process? How do I make an in-GPU-memory copy that doesn't impact DS processing? I just want to perform a histogram calculation, not modify the pipeline.
nvv4l2decoder is a HW-accelerated decoder, so the buffer it produces is special.
colorFormat 2 means NVBUF_COLOR_FORMAT_YUV420. It is not the ordinary YUV420 layout, but the NVIDIA HW-adapted format with padded dimensions.
For your example:
plane 0 size 8912896 = 4096 x 2176, because the HW pads the surface dimensions; here the height is rounded up to the next multiple of 32 (2160 -> 2176).
plane 1 and 2 size 2228224 = 2048 x 1088 follow the same rule (1080 -> 1088).
That link says 'The plugin accepts an encoded bitstream and uses the NVDEC hardware engine to decode the bitstream. The decoded output is in NV12 format.' Where does it explain the format and padding? I cannot find it in the link you provided. Thanks.