I wanted to update with an interesting data point: encoding a 2MP image (1920 x 1080) to an h264 frame with the above pipeline takes 1.1ms, but encoding a 12MP image (3040 x 4032) takes 110ms. Since h264 encoding scales roughly as n*log(n) and the 12MP frame has about 6x the pixels, I would have expected roughly a 10x penalty in the encode step, i.e. ~10-15ms, not 100x. Is there some other factor that could be further degrading performance on the Jetson?
Thanks for the comment, but I’m worried I didn’t explain my question well. I have correctly implemented a shared memory region that is referenced by both a cv::Mat and a cv::cuda::GpuMat instance. What I’m looking for is direction on a GStreamer pipeline that takes this shared memory region and lets me encode it as an h264 frame to a file.
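For context, the shared buffer is set up roughly like the sketch below (simplified; I’m using CUDA pinned/mapped host memory here as a stand-in for my actual allocation, and the dimensions are placeholders):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <cuda_runtime.h>

int main()
{
    // Placeholder dimensions; my real frames are 12MP BGR.
    const int rows = 3040, cols = 4032, type = CV_8UC3;
    const size_t bytes = static_cast<size_t>(rows) * cols * 3;

    // One pinned, mapped allocation visible to both CPU and GPU (zero-copy on Jetson).
    void* hostPtr = nullptr;
    cudaHostAlloc(&hostPtr, bytes, cudaHostAllocMapped);

    void* devPtr = nullptr;
    cudaHostGetDevicePointer(&devPtr, hostPtr, 0);

    cv::Mat cpuView(rows, cols, type, hostPtr);          // CPU view, no copy
    cv::cuda::GpuMat gpuView(rows, cols, type, devPtr);  // GPU view, no copy

    // ... CUDA work writes through gpuView; the encode pipeline reads cpuView ...

    cudaFreeHost(hostPtr);
    return 0;
}
```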
I provide an example pipeline that works for this task in the second-to-last line of my code block. I am also seeing very poor performance of this pipeline for a large (12MP / 36MB) cv::Mat, worse than the roughly O(n*log(n)) cost of h264 encoding would suggest.
I would love something like:
“appsrc ! video/x-raw(memory:NVMM) ! nvv4l2h264enc ! splitmuxsink muxer=mpegtsmux location=/file/to/be/appended/to.ts”
but I do not know GStreamer very well.
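For what it’s worth, here is my current best guess at wiring that up from C++ via cv::VideoWriter. This is only a sketch and not verified: I’m assuming appsrc hands over system-memory BGR frames (so videoconvert/nvvidconv and the BGRx/NV12 caps are needed before nvv4l2h264enc), and the frame rate, resolution, and location path are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>
#include <string>

int main()
{
    const int width = 4032, height = 3040;  // placeholder 12MP frame size
    const double fps = 30.0;                // placeholder frame rate

    // Guessed pipeline: appsrc hands BGR system-memory frames to GStreamer,
    // videoconvert/nvvidconv move them into NVMM for the hardware encoder,
    // and splitmuxsink appends the h264 stream to an MPEG-TS file.
    std::string pipeline =
        "appsrc ! video/x-raw,format=BGR ! videoconvert "
        "! video/x-raw,format=BGRx ! nvvidconv "
        "! video/x-raw(memory:NVMM),format=NV12 "
        "! nvv4l2h264enc ! h264parse "
        "! splitmuxsink muxer=mpegtsmux location=/file/to/be/appended/to.ts";

    cv::VideoWriter writer(pipeline, cv::CAP_GSTREAMER, 0 /*fourcc*/, fps,
                           cv::Size(width, height), true /*isColor*/);
    if (!writer.isOpened())
        return -1;

    cv::Mat frame(height, width, CV_8UC3);  // in my case this would wrap the shared buffer
    writer.write(frame);                    // push one frame into the encoder

    writer.release();
    return 0;
}
```

If something like this is on the right track, I can time individual writer.write() calls with std::chrono to see whether the 110ms is in the encoder itself or in the conversion/copy steps.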