Encode from cv::cuda::GpuMat

Hi All,
I have a stream of cv::cuda::GpuMat frames as input, and I need to encode them as in the sample encoder here:
HW Encoder

But the HW encoder’s input is YUV420; therefore, one solution might be to convert cv::cuda::GpuMat to YUV420 format first.
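For reference, here is a minimal sketch of what a planar YUV420 (I420) frame looks like in memory, assuming a tightly packed buffer with even dimensions. The `I420Layout` struct and `i420_layout` function are hypothetical names used only for illustration; real hardware buffers are usually pitch-aligned per plane and may use a different YUV420 variant, so treat the numbers as a lower bound rather than the encoder's actual allocation.

```cpp
#include <cassert>
#include <cstddef>

// Plane sizes and offsets (in bytes) for a tightly packed I420 frame.
// Hypothetical helper type for illustration; not part of any NVIDIA or OpenCV API.
struct I420Layout {
    std::size_t y_size;    // full-resolution luma plane: width * height
    std::size_t uv_size;   // one chroma plane: (width / 2) * (height / 2)
    std::size_t u_offset;  // byte offset of the U plane
    std::size_t v_offset;  // byte offset of the V plane
    std::size_t total;     // y_size + 2 * uv_size
};

inline I420Layout i420_layout(std::size_t width, std::size_t height) {
    // 4:2:0 subsampling halves chroma in both dimensions, so even sizes are assumed.
    assert(width % 2 == 0 && height % 2 == 0);
    I420Layout layout{};
    layout.y_size = width * height;
    layout.uv_size = (width / 2) * (height / 2);
    layout.u_offset = layout.y_size;
    layout.v_offset = layout.y_size + layout.uv_size;
    layout.total = layout.y_size + 2 * layout.uv_size;
    return layout;
}
```

Calling `i420_layout(1920, 1080)` gives a frame of 1.5 bytes per pixel, compared with 4 bytes per pixel for RGBA.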

Can you please help me achieve that? Do you have any other suggestions?

Thank you very much.


If you are on a JetPack 4 release, one possible solution is to allocate an RGBA NvBuffer, map it to a cv::cuda::GpuMat, and then convert it to YUV420 by calling NvBufferTransform().
There is no sample that fits your use case exactly, but you can check this patch for further information:
LibArgus EGLStream to nvivafilter - #14 by DaneLLL

For JetPack 5, the solution is the same, except that you need to use the NvBufSurface APIs.

Please check the documentation for more information about the jetson_multimedia_api samples:
Jetson Linux API Reference: Sample Applications | NVIDIA Docs

Thank you, DaneLLL, for the support.
If I understand correctly, what we can do is:
1 - create an RGBA NvBuffer
2 - access the GPU memory of this NvBuffer via an EGL stream
3 - copy the cv::cuda::GpuMat into the memory accessed in step 2
4 - convert it to YUV420 with NvBufferTransform()

After step 4, the YUV420 data is already in GPU memory; the only thing left is to have the hardware encoder encode it. This way nothing is copied from the CPU, and the data stays on the GPU throughout.
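To make the result of step 4 concrete, here is a CPU reference sketch of an RGBA to I420 conversion. It is not the hardware path and says nothing about how NvBufferTransform() is implemented; it assumes tightly packed buffers, even dimensions, BT.601 limited-range coefficients, and chroma taken from the average of each 2x2 block. The function name `rgba_to_i420` is hypothetical. Something like this can be handy for checking the converted output on a small test frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference only: tightly packed RGBA8888 -> tightly packed I420.
// Assumes BT.601 limited range; the hardware converter may be configured differently.
std::vector<std::uint8_t> rgba_to_i420(const std::vector<std::uint8_t>& rgba,
                                       std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(rgba.size() == width * height * 4);

    const std::size_t y_size = width * height;
    const std::size_t uv_size = (width / 2) * (height / 2);
    std::vector<std::uint8_t> out(y_size + 2 * uv_size);
    std::uint8_t* y_plane = out.data();
    std::uint8_t* u_plane = y_plane + y_size;
    std::uint8_t* v_plane = u_plane + uv_size;

    // Walk the image in 2x2 blocks: four luma samples, one U and one V sample each.
    for (std::size_t by = 0; by < height; by += 2) {
        for (std::size_t bx = 0; bx < width; bx += 2) {
            int sum_r = 0, sum_g = 0, sum_b = 0;
            for (std::size_t dy = 0; dy < 2; ++dy) {
                for (std::size_t dx = 0; dx < 2; ++dx) {
                    const std::size_t px = (by + dy) * width + (bx + dx);
                    const int r = rgba[px * 4 + 0];
                    const int g = rgba[px * 4 + 1];
                    const int b = rgba[px * 4 + 2];  // alpha is ignored
                    // The +16 offset is folded in before the shift.
                    y_plane[px] = static_cast<std::uint8_t>(
                        (66 * r + 129 * g + 25 * b + 128 + (16 << 8)) >> 8);
                    sum_r += r;
                    sum_g += g;
                    sum_b += b;
                }
            }
            // Rounded average colour of the block drives the chroma samples.
            const int r = (sum_r + 2) / 4;
            const int g = (sum_g + 2) / 4;
            const int b = (sum_b + 2) / 4;
            const std::size_t c = (by / 2) * (width / 2) + (bx / 2);
            // The +128 offset is folded in first so the shifted value is never negative.
            u_plane[c] = static_cast<std::uint8_t>(
                (-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
            v_plane[c] = static_cast<std::uint8_t>(
                (112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
        }
    }
    return out;
}
```

A solid-colour test frame is the easiest check, since every luma sample and every chroma sample should come out identical.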

The only part I am struggling with is how to tell the encoder to use that YUV420 buffer after step 4. In the samples, the frames always start in CPU memory (nv_buffer).

Could you please clarify whether I am missing something?

Thank you

Sorry for the late response. Is this still an issue that needs support? Thanks

For video encoding, please refer to this sample:


Hello kayccc,
Thank you for the response.
We’ve solved the issue as described above and our starting point was 01_video_encode.
But I think this solution still has room for improvement. Our input is a cv::cuda::GpuMat, and we need to apply the copy and transform described above, which takes around 30 ms per frame.
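For anyone wanting to see where those milliseconds go, here is a small timing sketch. The helper names `average_ms` and `max_fps` are hypothetical, and the sketch assumes each stage can be wrapped in a callable that only returns once its work is finished (for asynchronous CUDA work, that means synchronizing inside the callable, otherwise only the launch is timed).

```cpp
#include <cassert>
#include <chrono>

// Average wall-clock time of `stage` in milliseconds over `iterations` runs.
// Hypothetical helper for illustration; the callable must block until its work is done.
template <typename Stage>
double average_ms(Stage&& stage, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        stage();
    }
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = end - start;
    return elapsed.count() / iterations;
}

// Highest frame rate a strictly serial pipeline can sustain at this per-frame cost.
inline double max_fps(double ms_per_frame) {
    assert(ms_per_frame > 0.0);
    return 1000.0 / ms_per_frame;
}
```

At 30 ms per frame, `max_fps(30.0)` works out to roughly 33 frames per second before the encoder itself is counted, so timing the copy and the transform separately with `average_ms` shows which of the two is worth optimizing first.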

Thank you


If you call NvBufferTransform() or NvBufSurfTransform() for the conversion, please enable the hardware converter at maximum clock to have optimal throughput:
Nvvideoconvert issue, nvvideoconvert in DS4 is better than Ds5? - #3 by DaneLLL

Thank you for your quick response, DaneLLL,
I will do that.

