Is the CPU busy when the Jetson Nano's hardware decoder is doing the work?

Hi guys,
I have some questions about the hardware decoder of the Jetson Nano.

1- I would like clean example code for multi-stream decoding in Python.
2- When the Nano is decoding multiple streams on its dedicated decoding hardware, is the CPU also busy with this task? If so, why? Isn't this process handled entirely by the hardware decoder?

Hi,
You can run sudo tegrastats to get the system status. If you see NVDEC in tegrastats, hardware decoding is in use.

We enable hardware acceleration in tegra_multimedia_api and GStreamer. If you use the GStreamer nvv4l2decoder element in your Python code, it is hardware decoding.
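
A minimal sketch of what that looks like in Python (assuming an OpenCV build with GStreamer support; the URI is a placeholder):

import cv2

# Hypothetical RTSP source; replace with your camera URI.
uri = "rtsp://192.168.1.101:8554/stream"

# nvv4l2decoder decodes H.264 on NVDEC; nvvidconv converts the decoded
# NVMM buffer to a CPU BGRx buffer, and videoconvert produces the BGR
# frames that OpenCV expects from appsink.
pipeline = (
    f"rtspsrc location={uri} latency=300 ! rtph264depay ! h264parse ! "
    "nvv4l2decoder ! nvvidconv ! video/x-raw,format=BGRx ! "
    "videoconvert ! video/x-raw,format=BGR ! appsink"
)

cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # frame is a regular BGR numpy array here
cap.release()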

Thanks a lot.

I use JetPack 4.2 and OpenCV 3.4.

1- When I use the GStreamer elements below in OpenCV, I get these results:
CPU usage is 30-45% when decoding 8 streams at 1920x1080, and I can even decode 9 streams at 1920x1080. But the Jetson Nano supports only 8 streams, right? How can that happen?

rtspsrc location={uri} latency={latency} ! rtph265depay ! h265parse ! omxh265dec !
nvvidconv ! video/x-raw, width=(int){width}, height=(int){height}, format=(string)BGRx ! videoconvert ! appsink

sudo tegrastats:

RAM 447/3964MB (lfb 638x4MB) SWAP 0/10174MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [30%@1428,25%@1428,35%@1428,37%@1428] EMC_FREQ 11%@1600 GR3D_FREQ 0%@921 APE 25 PLL@37C CPU@40.5C PMIC@100C GPU@37.5C AO@47C thermal@38.75C POM_5V_IN 4534/4319 POM_5V_GPU 166/166 POM_5V_CPU 2450/2246

2- When I use the GStreamer elements below:
I can decode 8-9 streams, but even with only 4 streams the CPU usage reaches 100%.

rtspsrc location={} latency={} ! rtph265depay ! h265parse ! avdec_h265 ! videoconvert ! appsink

Q1 - I think case 1 uses the hardware accelerator to decode the streams and case 2 definitely uses CPU decoding, right?

Q2 - In your opinion, should I add nvv4l2decoder to the case 1 pipeline?

Q3 - I saw some blog posts saying that omxh265dec is for hardware decoding and avdec_h265 is for CPU decoding. What's your opinion?

Q1

Yes, you have that right. A full guide to the accelerated elements can be found at the link below.

In your opinion, should I add nvv4l2decoder to the case 1 pipeline?

According to the documentation, the OMX decoders are deprecated, so you might want to use nvv4l2decoder anyway.

When the Nano is decoding multiple streams on its dedicated decoding hardware, is the CPU also busy with this task?

No. With CUDA, your GPU can be doing something (even many things) while your CPU does something else entirely. Here are some examples from PyTorch of how this can work.
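
As a small illustration of that asynchrony (a sketch, assuming a CUDA-enabled PyTorch install; this is about CUDA in general, not the Nano's decoder specifically):

import torch

x = torch.randn(2048, 2048, device="cuda")

# Kernel launches are asynchronous: this returns as soon as the matmul
# is queued on the GPU, not when it finishes.
y = x @ x

# The CPU is free to do unrelated work while the GPU computes.
cpu_total = sum(i * i for i in range(1_000_000))

# Block only at the point where the GPU result is actually needed.
torch.cuda.synchronize()
print(y.sum().item(), cpu_total)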

Thanks.
Regarding Q3, what I mean is: as you can see in my sudo tegrastats output, there is still CPU usage. Why, when I am only running Python code for multi-stream hardware decoding? Why is the CPU busy? Is this normal, or is it caused by the decoding process?

Hi,
Please check

You have enabled hardware decoding, but there is a memcpy() between GStreamer and OpenCV. This consumes CPU.

Do you mean that I am not using a zero-copy method, and that the data is duplicated in memory? How can I fix this? If I set appsink as the GStreamer sink element, can the problem be solved?

I use these GStreamer elements in OpenCV, and the sudo tegrastats result is as shown in case 1 above:

gst_str = f'rtspsrc location={uri} latency={latency} ! rtph264depay ! h264parse ! omxh264dec ! nvvidconv ! video/x-raw, width=(int){width}, height=(int){height}, format=(string)BGRx ! videoconvert ! appsink'

cv2.VideoCapture(gst_str, cv2.CAP_GSTREAMER)
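
For reference, a rough multi-stream version of this (one capture thread per stream; a sketch assuming OpenCV with GStreamer support, and the URIs are hypothetical):

import threading
import cv2

def open_stream(uri, width=1920, height=1080, latency=300):
    # Same hardware-decode pipeline as above, parameterized per stream.
    return cv2.VideoCapture(
        f"rtspsrc location={uri} latency={latency} ! rtph264depay ! h264parse ! "
        f"omxh264dec ! nvvidconv ! "
        f"video/x-raw, width=(int){width}, height=(int){height}, format=(string)BGRx ! "
        f"videoconvert ! appsink",
        cv2.CAP_GSTREAMER,
    )

def reader(uri):
    cap = open_stream(uri)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # process frame here
    cap.release()

uris = [f"rtsp://192.168.1.101:8554/cam{i}" for i in range(8)]  # hypothetical
threads = [threading.Thread(target=reader, args=(u,), daemon=True) for u in uris]
for t in threads:
    t.start()
for t in threads:
    t.join()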

Hi,

When using appsink in OpenCV, this cannot be eliminated. The optimal pipeline on the Jetson Nano keeps NVMM buffers from source to sink. OpenCV is a CPU-based stack and only accepts CPU buffers in appsink.

An optimal solution is to leverage CUDA APIs and tegra_multimedia_api. Please refer to

The optimal pipeline on the Jetson Nano keeps NVMM buffers from source to sink

Should I add video/x-raw(memory:NVMM) to the GStreamer elements?

I want to use RTSP streaming in Python code. I don't know whether I can use tegra_multimedia_api in Python code or not.

Hi,

No, tegra_multimedia_api is not supported in Python.

The CPU usage is explained above. If you have to use appsink in OpenCV, please be aware of this overhead.

Thanks a lot,
What's the difference between video/x-raw(memory:NVMM) and video/x-raw?
Please give me an optimal set of GStreamer elements for decoding multiple streams with hardware decoding in a Python application.

Here are some Python examples and notebooks:

However, if your idea is to modify the buffer (the image itself) within Python, that's currently not possible. You can modify the metadata (e.g. bounding box coordinates of detections, labels, etc.), but not the image. It would be very slow in Python.

If you want to use OpenCV within DeepStream, there is an example plugin that does exactly that; however, it's written in C++. You could then use that plugin within Python.

That being said, your plugin will be much faster if you avoid any memory copies and keep the buffer in NVMM, which is GPU memory. So if you do use the example plugin linked above and you know some CUDA, you may wish to tear out the OpenCV Mat conversion. As DaneLLL mentions, memory copies are expensive.
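
To make the NVMM point concrete, here is a sketch of the two caps side by side (pipeline strings only; the RTSP location is a placeholder):

# Frames stay in NVMM (DMA/GPU-accessible) buffers from decoder to sink:
# no copy into system memory, so very little CPU work per frame.
nvmm_pipeline = (
    "rtspsrc location=rtsp://<camera> latency=300 ! rtph264depay ! h264parse ! "
    "nvv4l2decoder ! nvvidconv ! video/x-raw(memory:NVMM),format=NV12 ! "
    "nvoverlaysink"
)

# Plain video/x-raw means a CPU buffer: nvvidconv must copy each decoded
# frame out of NVMM, and that per-frame memcpy is the CPU usage you see
# whenever appsink/OpenCV is the consumer.
cpu_pipeline = (
    "rtspsrc location=rtsp://<camera> latency=300 ! rtph264depay ! h264parse ! "
    "nvv4l2decoder ! nvvidconv ! video/x-raw,format=BGRx ! "
    "videoconvert ! appsink"
)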

Hi
I use the omxh264dec element in GStreamer and I see NVDEC in tegrastats; that means the Nano is using the hardware decoder, right?
But I don't know why, when I use nvv4l2decoder, the decoding stops. I think I have to use a different combination of elements with nvv4l2decoder, right?

gst-launch-1.0 rtspsrc location=rtsp latency=300 ! rtph264depay ! h264parse ! nvv4l2decoder ! nvvidconv ! 'video/x-raw(memory:NVMM)', 'format=(string)BGRx' ! videoconvert ! fakesink

Hi,
When NVDEC is shown in tegrastats, it means the hardware decoder is in use.

The pipeline doesn't look right. You should send video/x-raw (a CPU buffer, not NVMM) to videoconvert. Please try:

gst-launch-1.0 rtspsrc location=rtsp latency=300 ! rtph264depay ! h264parse ! nvv4l2decoder ! nvvidconv ! video/x-raw,format=(string)BGRx ! videoconvert ! fakesink

I ran your suggested pipeline and got the log below, but NVDEC is not shown, and the CPU usage is stuck at zero.
When I use omxh264dec instead of nvv4l2decoder, NVDEC is shown.

jnano@jnano-desktop:~$ gst-launch-1.0 rtspsrc location=rtsp://192.168.1.101:8554/1920x1080.264 latency=300 ! rtph264depay ! h264parse ! nvv4l2decoder ! nvvidconv ! 'video/x-raw,format=(string)BGRx' ! videoconvert ! fakesink
nvbuf_utils: Could not get EGL display connection
Setting pipeline to PAUSED …
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL …
Progress: (open) Opening Stream
Progress: (connect) Connecting to rtsp://192.168.1.101:8554/1920x1080.264
Progress: (open) Retrieving server options
Progress: (open) Retrieving media info
Progress: (request) SETUP stream 0
Progress: (open) Opened Stream
Setting pipeline to PLAYING …
New clock: GstSystemClock
Progress: (request) Sending PLAY request
Progress: (request) Sending PLAY request
Progress: (request) Sent PLAY request
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261

(gst-launch-1.0:8568): GStreamer-CRITICAL **: 18:38:57.211: gst_mini_object_unref: assertion 'mini_object != NULL' failed
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261

Another question:
I want to get the decoded GStreamer frames into my Python code. How can I solve this?
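
A rough sketch of one way to do this with GStreamer's own Python bindings (assuming python3-gi is installed and the same elements as above; the URI is a placeholder):

import numpy as np
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.parse_launch(
    "rtspsrc location=rtsp://<camera> latency=300 ! rtph264depay ! h264parse ! "
    "nvv4l2decoder ! nvvidconv ! video/x-raw,format=BGRx ! "
    "appsink name=sink emit-signals=true max-buffers=2 drop=true"
)
sink = pipeline.get_by_name("sink")
pipeline.set_state(Gst.State.PLAYING)

sample = sink.emit("pull-sample")        # blocks until a frame arrives
buf = sample.get_buffer()
caps = sample.get_caps().get_structure(0)
w, h = caps.get_value("width"), caps.get_value("height")
ok, mapinfo = buf.map(Gst.MapFlags.READ)
frame = np.frombuffer(mapinfo.data, np.uint8).reshape(h, w, 4)  # BGRx
buf.unmap(mapinfo)
pipeline.set_state(Gst.State.NULL)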

Hi,
Please download one of the video files from the page below and try:
http://jell.yfish.us/

$ gst-launch-1.0 filesrc location= jellyfish-5-mbps-hd-h264.mkv ! matroskademux ! h264parse ! nvv4l2decoder ! nvoverlaysink

If you don't see NVDEC in sudo tegrastats, we suggest upgrading to JetPack 4.2.3 or 4.3.

I get this error:

nvbuf_utils: Could not get EGL display connection
Setting pipeline to PAUSED …
Opening in BLOCKING MODE
Pipeline is PREROLLING …
ERROR: from element /GstPipeline:pipeline0/GstFileSrc:filesrc0: Internal data stream error.
Additional debug info:
gstbasesrc.c(3055): gst_base_src_loop (): /GstPipeline:pipeline0/GstFileSrc:filesrc0:
streaming stopped, reason error (-5)
ERROR: pipeline doesn’t want to preroll.
Setting pipeline to NULL …
Freeing pipeline …

Please upgrade to JetPack 4.2.3 or 4.3.