I wanted to update with an interesting data point: encoding a 2MP image (1920 x 1080) to an h264 frame with the above pipeline takes 1.1ms, but encoding a 12MP image (3040 x 4032) takes 110ms. Since h264 encoding scales roughly as n*log(n) and the 12MP frame has about 6x the pixels, I would have expected roughly a 10x penalty in the encode step, i.e. ~10-15ms, not 100x. Is there some other factor that could be further degrading performance on the Jetson?
Thanks for the comment, but I’m worried I didn’t explain my question well. I have correctly implemented a shared memory region that is referenced by both a cv::Mat and a cv::cuda::GpuMat instance. What I’m looking for is direction on a GStreamer pipeline that takes this shared memory region and lets me encode it as an h264 frame to a file.
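For context, the shared buffer is set up roughly like the sketch below (simplified; I’m using CUDA pinned/mapped host memory here as a stand-in for my actual allocation, and the dimensions are placeholders):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <cuda_runtime.h>

int main()
{
    // Placeholder dimensions; my real frames are 12MP BGR.
    const int rows = 3040, cols = 4032, type = CV_8UC3;
    const size_t bytes = static_cast<size_t>(rows) * cols * 3;

    // One pinned, mapped allocation visible to both CPU and GPU (zero-copy on Jetson).
    void* hostPtr = nullptr;
    cudaHostAlloc(&hostPtr, bytes, cudaHostAllocMapped);

    void* devPtr = nullptr;
    cudaHostGetDevicePointer(&devPtr, hostPtr, 0);

    cv::Mat cpuView(rows, cols, type, hostPtr);          // CPU view, no copy
    cv::cuda::GpuMat gpuView(rows, cols, type, devPtr);  // GPU view, no copy

    // ... CUDA work writes through gpuView; the encode pipeline reads cpuView ...

    cudaFreeHost(hostPtr);
    return 0;
}
```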
I provide an example pipeline that works for this task in the second-to-last line of my code block. I am also seeing very poor performance of this pipeline for a large (12MP / 36MB) cv::Mat, worse than the roughly O(n*log(n)) cost of h264 encoding would suggest.
I would love something like:
“appsrc ! video/x-raw(memory:NVMM) ! nvv4l2h264enc ! splitmuxsink muxer=mpegtsmux location=/file/to/be/appended/to.ts”
but I do not know GStreamer very well.
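For what it’s worth, here is my current best guess at wiring that up from C++ via cv::VideoWriter. This is only a sketch and not verified: I’m assuming appsrc hands over system-memory BGR frames (so videoconvert/nvvidconv and the BGRx/NV12 caps are needed before nvv4l2h264enc), and the frame rate, resolution, and location path are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>
#include <string>

int main()
{
    const int width = 4032, height = 3040;  // placeholder 12MP frame size
    const double fps = 30.0;                // placeholder frame rate

    // Guessed pipeline: appsrc hands BGR system-memory frames to GStreamer,
    // videoconvert/nvvidconv move them into NVMM for the hardware encoder,
    // and splitmuxsink appends the h264 stream to an MPEG-TS file.
    std::string pipeline =
        "appsrc ! video/x-raw,format=BGR ! videoconvert "
        "! video/x-raw,format=BGRx ! nvvidconv "
        "! video/x-raw(memory:NVMM),format=NV12 "
        "! nvv4l2h264enc ! h264parse "
        "! splitmuxsink muxer=mpegtsmux location=/file/to/be/appended/to.ts";

    cv::VideoWriter writer(pipeline, cv::CAP_GSTREAMER, 0 /*fourcc*/, fps,
                           cv::Size(width, height), true /*isColor*/);
    if (!writer.isOpened())
        return -1;

    cv::Mat frame(height, width, CV_8UC3);  // in my case this would wrap the shared buffer
    writer.write(frame);                    // push one frame into the encoder

    writer.release();
    return 0;
}
```

If something like this is on the right track, I can time individual writer.write() calls with std::chrono to see whether the 110ms is in the encoder itself or in the conversion/copy steps.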