I am currently developing a multimedia application using the Jetson Multimedia API.
The basic flow of the application is as follows:
Input: two cameras using MIPI and LibArgus, 4K, 60 FPS.
The cameras are software-synchronized using the LibArgus API, similar to the syncSensor sample in LibArgus.
The two cameras are exposed to my application as two Argus::OutputStream objects of type Argus::STREAM_TYPE_EGL.
The first step in my application collects two synchronized images and composites them into a stacked NvBufSurface using the NvBufSurfTransformMultiInputBufCompositeBlend API. This gives me a single NvBufSurface containing the two images stacked on top of each other.
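For completeness: the stacked destination that the composite writes into is a plain NvBufSurface I allocate once up front, roughly like the sketch below. NV12 pitch-linear and the 3840x4320 size are just what my setup uses, not the only valid choice.

```cpp
#include "nvbufsurface.h"

// Destination for the two stacked 4K frames (3840x2160 each, stacked vertically).
// Sketch only: NV12 pitch-linear in surface-array (hardware-accessible) memory is
// what I happen to use here.
static NvBufSurface *allocate_stacked_surface()
{
    NvBufSurfaceCreateParams create_params = {};
    create_params.gpuId       = 0;
    create_params.width       = 3840;
    create_params.height      = 2160 * 2;              // two 4K frames stacked
    create_params.colorFormat = NVBUF_COLOR_FORMAT_NV12;
    create_params.layout      = NVBUF_LAYOUT_PITCH;
    create_params.memType     = NVBUF_MEM_SURFACE_ARRAY;

    NvBufSurface *stacked = NULL;
    if (NvBufSurfaceCreate(&stacked, 1, &create_params) != 0)
        return NULL;                                    // allocation failed
    return stacked;
}
```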
I display the stacked NvBufSurface on screen; this works and performs very well.
Now the next step is the problem. In addition to rendering the stream on screen, I would like to feed these NvBufSurfaces efficiently to the hardware V4L2 H.265 encoder. The video-encoding examples I have read use NvBuffers or v4l2_bufs as inputs.
I have read the examples that use NvVideoEncoder.h, but have not found any hints on how to approach this.
Because of the very high memory bandwidth requirements (dual 4K at 60 Hz), there is no way I can afford to copy the buffers to “normal” memory and then write them back to “device” memory for the conversion; e.g. V4L2_MEMORY_USERPTR and similar approaches are not going to work. I need an approach that is copy-free.
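Concretely, what I am hoping is possible is something along the lines of the sketch below. This is only my current understanding, not working code: I am assuming the encoder output plane accepts V4L2_MEMORY_DMABUF and that the dmabuf fd backing an NvBufSurface is what should be queued; "enc0", the buffer counts, the bitrate, and the 3840x4320 NV12 format are placeholders from my setup.

```cpp
#include "NvVideoEncoder.h"
#include "nvbufsurface.h"

// Sketch: configure the hardware encoder so that its output plane (the plane that
// receives raw frames, in V4L2 M2M terms) imports dmabuf fds, so the encoder can
// read directly from the memory that already backs the composited NvBufSurface.
static NvVideoEncoder *setup_encoder(uint32_t w, uint32_t h)
{
    NvVideoEncoder *enc = NvVideoEncoder::createVideoEncoder("enc0");

    // Capture plane carries the encoded H.265 bitstream; it is configured before
    // the output plane in the samples I have seen.
    enc->setCapturePlaneFormat(V4L2_PIX_FMT_H265, w, h, 2 * 1024 * 1024);
    enc->setOutputPlaneFormat(V4L2_PIX_FMT_NV12M, w, h);
    enc->setBitrate(40 * 1000 * 1000);

    // V4L2_MEMORY_DMABUF on the output plane: the encoder does not own the raw
    // pixel memory, it only imports the fds that get queued on it later.
    enc->output_plane.setupPlane(V4L2_MEMORY_DMABUF, 10, true, false);
    enc->capture_plane.setupPlane(V4L2_MEMORY_MMAP, 10, true, false);

    enc->output_plane.setStreamStatus(true);
    enc->capture_plane.setStreamStatus(true);
    return enc;
}

// Per composited frame, the hope is that something like this is enough:
static void queue_frame(NvVideoEncoder *enc, NvBufSurface *stacked, uint32_t index)
{
    struct v4l2_buffer v4l2_buf = {};
    struct v4l2_plane planes[MAX_PLANES] = {};
    v4l2_buf.m.planes = planes;
    v4l2_buf.index = index;                           // real code must recycle indices
    v4l2_buf.m.planes[0].m.fd =
        (int)stacked->surfaceList[0].bufferDesc;      // the dmabuf fd behind the surface?
    v4l2_buf.m.planes[0].bytesused = 1;               // non-zero to mark a valid frame
    enc->output_plane.qBuffer(v4l2_buf, NULL);
}
```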
Any pointers on how to do this would be greatly appreciated!
The example is 2500 lines of dense, quite convoluted, V4L2 + NVIDIA-specific code with few comments and little explanation; it is hard to learn from.
From reading the code, it is clear that one needs a deep understanding of the V4L2 + NVIDIA subsystem: the different capture/output planes, how buffer ownership is handed back and forth between the application and the encoder, and so on. So much is left unexplained that, even with what I consider a fairly substantial systems programming background, I cannot confidently modify these examples to do what I want.
Does documentation exist that explains these high-level details of how to use the NVIDIA V4L2 encoders? It feels like I am missing a 10-20 page architecture document that outlines how these subsystems work together.
I could probably get by without a deeper understanding by just hacking around, but then I would need more pointers on how to feed an NvBufSurface to the encoder. All the code I have read hands the encoder NvBuffers, and I see no clear way to convert directly between the two. Do you have a more minimal example that takes an NvBufSurface and encodes it to video, or a list of the steps needed to convert between them?
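The closest I have gotten on my own is the observation that both types seem to meet at the dmabuf fd level, something like the following; this is purely my reading of nvbufsurface.h and may well be wrong.

```cpp
#include "nvbufsurface.h"

// My working assumption: on Jetson, each NvBufSurfaceParams entry carries the dmabuf
// fd of the underlying hardware buffer in bufferDesc, and that fd is what the V4L2
// components ultimately consume.
static int surface_to_fd(NvBufSurface *surf)
{
    return (int)surf->surfaceList[0].bufferDesc;
}

// And the reverse direction: given a dmabuf fd that was allocated through the
// NvBufSurface APIs, NvBufSurfaceFromFd retrieves the NvBufSurface wrapping it.
static NvBufSurface *fd_to_surface(int dmabuf_fd)
{
    void *surf = NULL;
    if (NvBufSurfaceFromFd(dmabuf_fd, &surf) != 0)
        return NULL;
    return (NvBufSurface *)surf;
}
```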
Hi,
The patch r35_31_add_nvvideoencoder.zip demonstrates how to feed an NvBufSurface to the encoder. Please take a look and apply it to the 12_camera_v4l2_cuda sample for a try; you can run that sample with a USB camera.
Hi, using the patch I was able to hack together something that does what I need.
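In case it is useful to others who land here, the core of what I ended up with looks roughly like this. It is heavily simplified (my own context struct, simplified index bookkeeping, no error or EOS handling), so treat it as a sketch rather than reference code.

```cpp
#include <fstream>
#include "NvVideoEncoder.h"
#include "nvbufsurface.h"

struct EncCtx {
    NvVideoEncoder *enc;
    std::ofstream out;          // raw .h265 bitstream file
};

// Steady state, once per composited frame: if every output-plane slot is already
// owned by the encoder, dequeue one first, then queue the new frame's dmabuf fd.
static bool encode_frame(EncCtx *ctx, NvBufSurface *stacked)
{
    struct v4l2_buffer v4l2_buf = {};
    struct v4l2_plane planes[MAX_PLANES] = {};
    v4l2_buf.m.planes = planes;

    NvV4l2ElementPlane &op = ctx->enc->output_plane;
    if (op.getNumQueuedBuffers() == op.getNumBuffers())
    {
        NvBuffer *released = NULL;
        if (op.dqBuffer(v4l2_buf, &released, NULL, 10) < 0)
            return false;                              // encoder did not return a slot
    }
    else
    {
        v4l2_buf.index = op.getNumQueuedBuffers();     // simplified index bookkeeping
    }

    // Zero-copy hand-off: the encoder reads straight from the composited surface.
    v4l2_buf.m.planes[0].m.fd = (int)stacked->surfaceList[0].bufferDesc;
    v4l2_buf.m.planes[0].bytesused = 1;                // 0 would be treated as EOS
    return op.qBuffer(v4l2_buf, NULL) == 0;
}

// Capture-plane callback, registered with setDQThreadCallback()/startDQThread()
// during setup: each invocation delivers one chunk of encoded H.265 bitstream.
static bool capture_callback(struct v4l2_buffer *v4l2_buf, NvBuffer *buffer,
                             NvBuffer * /*shared*/, void *arg)
{
    EncCtx *ctx = static_cast<EncCtx *>(arg);
    ctx->out.write((char *)buffer->planes[0].data, buffer->planes[0].bytesused);
    // Real code should also detect EOS here (bytesused == 0) and stop the thread.
    // Give the bitstream buffer back to the encoder so it can be refilled.
    return ctx->enc->capture_plane.qBuffer(*v4l2_buf, NULL) == 0;
}
```

The setup itself (plane formats, setupPlane with V4L2_MEMORY_DMABUF on the output plane, queueing all capture-plane buffers once and starting the DQ thread) follows the usual sample pattern.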
Thanks for the help!
I still stand by my previous statement: these libraries are severely underdocumented relative to their complexity. Examples alone are not enough to explain everything needed to work with them. Documents describing the libraries' core concepts would be low-hanging fruit that saves you support requests here.