The Jetson Nano H.264 encoder takes input from what is called the Output Plane, with pixel encoding and (width, height) defined, and generates encoded data onto the “Capture Plane”.
My question involves the Capture Plane. Apparently, in addition to specifying pixel encoding and (width, height) for the Output Plane, we ALSO need to specify a pixel encoding and (width, height) for the Capture Plane, BUT in theory this Capture Plane holds a compressed bytestream, which would not be accessible as normal pixels until eventual decoding. Indeed, the amount of data could vary wildly between IDR frames and frames that encode only incremental changes.
Can a kind developer explain what data goes into this Capture Plane, how to calculate the (width, height) and pixel type for it, and how to determine the length of the presumably variable-length data to pull out of it (e.g. for network streaming)? Ideally, I would also like to know whether the encoder output already contains Annex B start-code prefixes and emulation prevention bytes, or whether I need to add them myself.
Intent here is to build in C a low latency pipeline from camera to H.264 encode, to network streaming (whether via RTP or otherwise). So, understanding concepts of what is IN the Capture Plane and how to deal with it, would be very helpful!
ctx.height, 2 * 1024 * 1024);
For high resolutions this may be too small. If the value is too small, the low-level code raises it to width x height x 1.5 bytes (the size of a YUV420 frame), so you should not need to change it. Alternatively, you can set it to width x height x 1.5 bytes yourself.
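As a rough illustration of the sizing rule described above, here is a minimal C sketch. The function names `yuv420_frame_bytes` and `capture_buffer_bytes` are hypothetical and are not part of any NVIDIA API; the sketch only mirrors the "at least width x height x 1.5 bytes" behaviour mentioned in the reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of one YUV420 frame: a full-resolution luma plane plus
 * two quarter-resolution chroma planes, i.e. width * height * 1.5. */
size_t yuv420_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    return ((size_t)width * height * 3) / 2;
}

/* Hypothetical helper: pick a capture-plane buffer size that is never
 * smaller than one YUV420 frame, mirroring the fallback described in
 * the reply. `requested` is the size the application asked for. */
size_t capture_buffer_bytes(uint32_t width, uint32_t height, size_t requested)
{
    size_t floor_bytes = yuv420_frame_bytes(width, height);
    return requested > floor_bytes ? requested : floor_bytes;
}
```

For example, at 1920x1080 a 2 MiB request (2097152 bytes) is below the 3110400-byte YUV420 frame size, so the helper would return 3110400; at 640x480 the 2 MiB request is kept unchanged.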
Now, should I assume that this is simply a means to allocate a buffer, but should expect that the actual data (being compressed) will be of varying lengths per the NALU (not always filling this buffer full)?
Subquestion 1: is the capture plane data encoded by Nvidia to have Annex B prefix bytes and emulation prevention bytes?
Subquestion 2: can there be more than one NALU in a returned Capture Plane buffer? (i.e. back-to-back runs of bytes, each being a NALU)
Subquestion 3: does a NALU always exactly begin at this buffer’s start point (can the buffer be considered an alignment point for the start-of-NALU at byte0)?
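One way to check the three subquestions empirically is to scan a returned buffer for Annex B start codes. The sketch below assumes the buffer holds an Annex B byte stream, which this thread has not yet confirmed; `nalu_span`, `find_start_code`, and `split_annexb` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Location of one NAL unit inside a buffer (start code excluded). */
typedef struct {
    size_t offset;  /* index of the NAL header byte */
    size_t length;  /* number of bytes in the NAL unit */
} nalu_span;

/* Return the index of the next 00 00 01 start code at or after `pos`,
 * or `len` if there is none. The 4-byte form 00 00 00 01 is found via
 * its last three bytes. */
size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    }
    return len;
}

/* Split an Annex B buffer into NAL units. Writes at most `max_out`
 * spans into `out` and returns how many were written. */
size_t split_annexb(const uint8_t *buf, size_t len,
                    nalu_span *out, size_t max_out)
{
    assert(buf != NULL && out != NULL);
    size_t count = 0;
    size_t sc = find_start_code(buf, len, 0);
    while (sc < len && count < max_out) {
        size_t start = sc + 3;                 /* first byte after 00 00 01 */
        size_t next = find_start_code(buf, len, start);
        size_t end = next;
        /* Trailing zeros are padding or the leading zero of the next
         * 4-byte start code; a NAL unit never ends in 0x00. */
        while (end > start && buf[end - 1] == 0)
            end--;
        if (end > start) {
            out[count].offset = start;
            out[count].length = end - start;
            count++;
        }
        sc = next;
    }
    return count;
}
```

If the first span's `offset` is 3 or 4, the buffer begins with a start code at byte 0; if `split_annexb` returns more than one span, the buffer carries several NAL units back to back. The NAL unit type is `buf[span.offset] & 0x1F` (7 for SPS, 8 for PPS, 5 for an IDR slice).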
By the way, thank you for mentioning ‘encoder_unit_sample’ which has helpful comments on usage. I’m tracing through that code now and might have followup questions later.
How would we know if we have under-sized the capture buffer for the bitstream being written to it (i.e. the encoder ran out of space)? Is there an error thrown, and will it tell us how much room would have been needed?
Also, do we receive some value of “bytes written” for each buffer emitted by the encoder?