OMX hardware encoding CPU at 100% (SOLVED)

I’m not sure if this is due to hardware or limitations of the Nano, but the same pipelines I use for ROS OMX video encoding on the TX1 show 100% CPU usage on the Nano. Is this normal, or am I missing a new pipeline for hardware video encoding?

gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh264enc ! video/x-h264, framerate=30/1, stream-format=byte-stream ! h264parse ! rtph264pay config-interval=1 ! udpsink host=192.168.1.183 port=5600 t. ! queue ! videoconvert

Okay, so this is strange. When using a USB 2.0 UVC capture device with OMX hardware encoding I get 20% CPU usage, but when I use my USB 3.0 1080p UVC capture device on the USB 3.0 port, CPU usage goes over 100% with the same gst-launch pipeline.

I don’t experience this at all on my Jetson TX1, or even the TK1 over USB 3.0.

Edit: This happens on both USB 2.0 and 3.0; the CPU load depends entirely on the resolution…

Hi,
Please share information about your v4l2 source:

$ v4l2-ctl -d /dev/video0 --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
	Index       : 0
	Type        : Video Capture
	Pixel Format: 'YUYV'
	Name        : YUYV 4:2:2
		Size: Discrete 1920x1080
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x480
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 800x600
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1024x768
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1280x720
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1280x960
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1280x1024
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1360x768
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1400x900
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 1440x900
			Interval: Discrete 0.017s (60.000 fps)
			Interval: Discrete 0.033s (30.000 fps)

Thanks for the response DaneLLL. I tried some more pipelines, and yes, since gst-launch negotiates the first available resolution, which is 1920x1080 @ 60 fps, the CPU usage reaches 100%. If I switch to the lowest resolution, 640x480 @ 30 fps, the CPU usage drops to about 35%.
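
For anyone following along, pinning the capture caps instead of letting gst-launch negotiate the first format should look roughly like this (an untested sketch based on my original pipeline; the device, host, and port are just my setup):

gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! video/x-raw, format=YUY2, width=640, height=480, framerate=30/1 ! videoconvert ! omxh264enc ! video/x-h264, stream-format=byte-stream ! h264parse ! rtph264pay config-interval=1 ! udpsink host=192.168.1.183 port=5600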

I don’t experience this at all on my Jetson TX1, but at the moment it is not a big deal for my ROS projects, since I plan to use lower resolutions to broadcast my RTSP stream over Wi-Fi and 4G LTE.

Is it normal on the Nano to see such a high load when using hardware encoding at high resolutions? I see this at both 1080p and 720p.

Could someone else who is using gst-launch-1.0 at 1920x1080 @ 60 fps comment on their experience?

Here are the exact gst pipelines I use to stream via rtspclientsink to my Wowza server, which broadcasts all my ROS streams:

H265 @ 750k, 1080p @ 60 fps = 100% CPU usage

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh265enc bitrate=750000 ! video/x-h265, width=1920, height=1080, framerate=60/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

H264 @ 750k, 720p @ 30 fps = 85% CPU usage

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh264enc bitrate=750000 ! video/x-h264, width=1280, height=720, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

H264 @ 750k, 480p @ 30 fps = 35% CPU usage

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh264enc bitrate=750000 ! video/x-h264, width=640, height=480, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

One more thing to add: I also see a lot of kernel messages reporting:

[ 6553.821845] uvcvideo: Failed to resubmit video URB (-1).
[ 6556.216674] usb 2-1.3: usb_suspend_both: status 0
[ 6556.276498] usb 2-1: usb_suspend_both: status 0
[ 6556.282031] usb usb2: usb_suspend_both: status 0

I’ll look into these topics for the time being:

https://devtalk.nvidia.com/default/topic/1043290/jetson-tx2/usb-3-0-uvc-camera-issue-/

https://devtalk.nvidia.com/default/topic/937487/jetson-tx1/currupted-frames-from-ar1820-usb-camera/

Nothing I have tried reduces the CPU load. I am going to run the same pipelines again on the TX1/TK1 and compare results. It seems hardware encoding is not actually being used if the CPU is at 100%…
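
In the meantime, one quick way to confirm whether the hardware encoder is actually engaged should be to watch tegrastats while the pipeline runs and see whether the NVENC clock shows up in its output (I am assuming the Nano reports NVENC there the same way the TX1 does):

sudo tegrastats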

Hi,
Below is the result with an E-CON CU135 running at 1080p60 (r32.1), for your reference.

$ sudo jetson_clocks
$ gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=600 ! video/x-raw,format=UYVY,width=1920,height=1080,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! nvv4l2h264enc ! fpsdisplaysink video-sink=fakesink text-overlay=false -v
RAM 828/3957MB (lfb 643x4MB) IRAM 0/252kB(lfb 252kB) CPU [61%@1428,10%@1428,4%@1428,6%@1428] EMC_FREQ 13%@1600 GR3D_FREQ 0%@921 NVENC 652 APE 25 PLL@21.5C CPU@25C PMIC@100C GPU@23C AO@34C thermal@24C POM_5V_IN 3343/2038 POM_5V_GPU 119/37 POM_5V_CPU 636/220
RAM 828/3957MB (lfb 643x4MB) IRAM 0/252kB(lfb 252kB) CPU [60%@1428,7%@1428,2%@1428,10%@1428] EMC_FREQ 13%@1600 GR3D_FREQ 0%@921 NVENC 652 APE 25 PLL@21.5C CPU@25C PMIC@100C GPU@23C AO@34.5C thermal@24C POM_5V_IN 3343/2042 POM_5V_GPU 119/37 POM_5V_CPU 597/221
RAM 828/3957MB (lfb 643x4MB) IRAM 0/252kB(lfb 252kB) CPU [61%@1428,9%@1428,6%@1428,8%@1428] EMC_FREQ 13%@1600 GR3D_FREQ 0%@921 NVENC 652 APE 25 PLL@22C CPU@25C PMIC@100C GPU@23.5C AO@34C thermal@24C POM_5V_IN 3383/2046 POM_5V_GPU 119/38 POM_5V_CPU 635/222
RAM 828/3957MB (lfb 643x4MB) IRAM 0/252kB(lfb 252kB) CPU [61%@1428,8%@1428,6%@1428,6%@1428] EMC_FREQ 13%@1600 GR3D_FREQ 0%@921 NVENC 652 APE 25 PLL@22C CPU@25.5C PMIC@100C GPU@23.5C AO@34C thermal@24.25C POM_5V_IN 3383/2051 POM_5V_GPU 119/38 POM_5V_CPU 636/223

For your camera, you should configure it to ‘video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1’
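
With your sensor, that test command would then become roughly the following (only the format string changes from UYVY to YUY2; everything else is as above):

gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=600 ! video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! nvv4l2h264enc ! fpsdisplaysink video-sink=fakesink text-overlay=false -v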

Thank you for the very informative post, DaneLLL. I tried your pipeline and it works as is, but how do I incorporate rtspclientsink and also set an encoding bitrate? I am broadcasting video over 4G LTE, and uncompressed video/x-raw output is undesirable there.

I get an input/output error when trying to append rtspclientsink:

./gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=600 ! video/x-raw,format=YUY2,width=1920,height=1080,framerate=60/1 ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! nvv4l2h264enc ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano
Setting pipeline to PAUSED ...
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL ...
Progress: (open) Opening Stream
Progress: (connect) Connecting to rtsp://192.168.1.133:1935/live/JetsonNano
Progress: (open) Retrieving server options
/GstPipeline:pipeline0/GstV4l2Src:v4l2src0.GstPad:src: caps = video/x-raw, format=(string)YUY2, width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)2:4:7:1, interlace-mode=(string)progressive
/GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:src: caps = video/x-raw, format=(string)YUY2, width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)2:4:7:1, interlace-mode=(string)progressive
/GstPipeline:pipeline0/Gstnvvconv:nvvconv0.GstPad:src: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, format=(string)NV12
/GstPipeline:pipeline0/GstCapsFilter:capsfilter1.GstPad:src: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, format=(string)NV12
/GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0.GstRtspClientSinkPad:sink_0.GstProxyPad:proxypad0: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
Progress: (open) Opened Stream
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Progress: (request) Sending RECORD request
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin/GstRtpBin:rtpbin0: latency = 2000
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin/GstRtpBin:rtpbin0: ntp-time-source = NTP time based on realtime clock
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin/GstRtpH264Pay:rtph264pay0: pt = 96
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin.GstGhostPad:ghostpad0.GstProxyPad:proxypad1: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin/GstRtpH264Pay:rtph264pay0.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0/GstBin:rtspbin.GstGhostPad:ghostpad0: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
/GstPipeline:pipeline0/GstRTSPClientSink:rtspclientsink0.GstRtspClientSinkPad:sink_0: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)NULL, level=(string)NULL, width=(int)1920, height=(int)1080, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)60/1, interlace-mode=(string)progressive, colorimetry=(string)bt709, chroma-site=(string)mpeg2
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
/GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0.GstPad:sink: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, format=(string)NV12
/GstPipeline:pipeline0/GstCapsFilter:capsfilter1.GstPad:sink: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, format=(string)NV12
/GstPipeline:pipeline0/Gstnvvconv:nvvconv0.GstPad:sink: caps = video/x-raw, format=(string)YUY2, width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)2:4:7:1, interlace-mode=(string)progressive
/GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:sink: caps = video/x-raw, format=(string)YUY2, width=(int)1920, height=(int)1080, framerate=(fraction)60/1, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)2:4:7:1, interlace-mode=(string)progressive
H264: Profile = 66, Level = 0

(gst-launch-1.0:27341): GStreamer-CRITICAL **: 09:34:54.748: gst_structure_set: assertion 'IS_MUTABLE (structure) || field == NULL' failed

(gst-launch-1.0:27341): GStreamer-CRITICAL **: 09:34:54.749: gst_structure_set: assertion 'IS_MUTABLE (structure) || field == NULL' failed
Redistribute latency...
ERROR: from element /GstPipeline:pipeline0/GstV4l2Src:v4l2src0: Device '/dev/video0' failed during initialization
Additional debug info:
gstv4l2object.c(3698): gst_v4l2_object_set_format_full (): /GstPipeline:pipeline0/GstV4l2Src:v4l2src0:
Call to TRY_FMT failed for YUYV @ 1920x1080: Input/output error
Execution ended after 0:00:05.532982551
Setting pipeline to PAUSED ...
Setting pipeline to READY ...

Just found this patch, which I will have to try: https://bug796789.bugzilla-attachments.gnome.org/attachment.cgi?id=372998

Reference: Bug 796789 – v4l2 initialization fails with TRY_FMT call

Well, I built GStreamer 1.14.4 (the Nano ships with 1.14.1) and now omxh264enc appears to be using gst-omx hardware encoding, because the same pipelines only use about 15% CPU. Consider this solved. For anyone experiencing slow omxh264enc performance: use this guide, change your VERSION to VERSION=1.14.4, and build.
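
For reference, the build boils down to grabbing the 1.14.4 tarballs and running the usual configure/make for each one (a rough sketch from memory, not the guide's exact steps; dependency packages and the install prefix may differ on your setup):

export VERSION=1.14.4
wget https://gstreamer.freedesktop.org/src/gstreamer/gstreamer-$VERSION.tar.xz
wget https://gstreamer.freedesktop.org/src/gst-plugins-base/gst-plugins-base-$VERSION.tar.xz
wget https://gstreamer.freedesktop.org/src/gst-plugins-good/gst-plugins-good-$VERSION.tar.xz
wget https://gstreamer.freedesktop.org/src/gst-plugins-bad/gst-plugins-bad-$VERSION.tar.xz
# for each tarball: tar xf <name>.tar.xz, cd into it, then ./configure && make -j4 && sudo make install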

Here are the CPU results for GStreamer 1.14.4:

H264 720p @ 30 and 60 fps = 20% CPU

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh264enc bitrate=750000 ! video/x-h264, width=1280, height=720, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

H264 1080p @ 30 and 60 fps = 35% CPU

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! videoconvert ! omxh264enc bitrate=750000 ! video/x-h264, width=1920, height=1080, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

Another update on this: Honey_Patouceul suggested simply replacing videoconvert with nvvidconv, and this dropped the CPU usage even more.

CPU usage is now at 25-30% with nvvidconv, H264 1080p @ 60 fps

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! nvvidconv ! omxh264enc bitrate=750000 ! video/x-h264, width=1920, height=1080, framerate=60/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

Optimal for streaming over Wi-Fi/4G LTE
CPU usage 20-30%

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! nvvidconv ! omxh264enc bitrate=750000 ! video/x-h264, width=1280, height=720, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

H265 720p/1080p with (memory:NVMM),format=NV12 for even less CPU (15-25% CPU usage)

./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! omxh265enc bitrate=750000 ! video/x-h265, width=1280, height=720, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano
./gst-launch-1.0 -vvv -e v4l2src device=/dev/video0 ! tee name=t ! queue ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! omxh265enc bitrate=750000 ! video/x-h265, width=1920, height=1080, framerate=30/1, stream-format=byte-stream ! rtspclientsink location=rtsp://192.168.1.133:1935/live/JetsonNano

Hi Linux4all,

I don’t know why, but I don’t have the rtspclientsink plugin on my Nano.
Could you please help me install this plugin?

Thanks

Hi,
Please check
https://stackoverflow.com/questions/47527480/warning-erroneous-pipeline-no-element-rtspclientsink
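
For context, rtspclientsink lives in gst-rtsp-server rather than the base plugin sets, so on the Nano's Ubuntu image installing the corresponding packages is usually enough (package names from memory; the linked answer covers building gst-rtsp-server from source if they are not available):

sudo apt-get install libgstrtspserver-1.0-0 gstreamer1.0-rtsp
gst-inspect-1.0 rtspclientsink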

Hi DaneLLL,

Thank you very much for your quick reply. I was able to install the rtspclientsink plugin.

Many thanks