Hello,
we are using the TX2 with R32.2.1 on a customer carrier board. The main application captures images from up to six image sensors (IMX290, 1920x1080 @ 30 fps) connected via CSI, using the nvargus-daemon through the Multimedia API.
Now that we are in the phase of optimizing the system, it turned out that the nvargus-daemon consumes a lot of CPU time.
Because the camera images need further processing in OpenCV after ‘acquireFrame’, each YUV420 image is color converted (to ARGB32) and copied into CPU memory with the NvConverter.
The CPU load for converting a single camera image is around 20% and scales with the number of processed images.
Using the NvConverter even introduces a non-deterministic frame latency of 4-20 ms!
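Since the ARGB32 intermediate only exists to feed OpenCV, the two conversions could in principle be collapsed into a single YUV420 -> RGB pass. As a point of reference for what such a pass involves per pixel, here is a minimal CPU-only sketch of direct I420 -> packed RGB24 (assuming full-range BT.601 coefficients; `i420_to_rgb24` is illustrative, not an existing API):

```cpp
#include <algorithm>
#include <cstdint>

// Clamp an intermediate value into the 0..255 byte range.
static inline uint8_t clamp8(int v) { return (uint8_t)std::min(255, std::max(0, v)); }

// Direct I420 (planar YUV420) -> packed RGB24 in one pass, collapsing the
// YUV420 -> ARGB32 -> RGB chain. Coefficients are full-range BT.601,
// scaled by 1024 to keep the inner loop in integer arithmetic.
void i420_to_rgb24(const uint8_t* y, const uint8_t* u, const uint8_t* v,
                   int width, int height, uint8_t* rgb)
{
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            int Y = y[row * width + col];
            // Chroma planes are subsampled 2x2 in I420.
            int U = u[(row / 2) * (width / 2) + col / 2] - 128;
            int V = v[(row / 2) * (width / 2) + col / 2] - 128;
            uint8_t* px = rgb + (row * width + col) * 3;
            px[0] = clamp8(Y + ((1436 * V) >> 10));           // R = Y + 1.402 V
            px[1] = clamp8(Y - ((352 * U + 731 * V) >> 10));  // G = Y - 0.344 U - 0.714 V
            px[2] = clamp8(Y + ((1815 * U) >> 10));           // B = Y + 1.772 U
        }
    }
}
```

The same per-pixel math is what a custom CUDA kernel (question 3 below) would run, one thread per output pixel, so this also serves as a reference for validating such a kernel.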
%Cpu0 : 44.5 us, 46.5 sy, 0.0 ni, 5.7 id, 0.0 wa, 3.0 hi, 0.3 si, 0.0 st
%Cpu3 : 54.0 us, 40.6 sy, 0.0 ni, 5.0 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st
%Cpu4 : 55.1 us, 39.5 sy, 0.0 ni, 4.7 id, 0.0 wa, 0.7 hi, 0.0 si, 0.0 st
%Cpu5 : 57.0 us, 38.0 sy, 0.0 ni, 4.3 id, 0.0 wa, 0.3 hi, 0.3 si, 0.0 st
MiB Mem : 7871.1 total, 4465.4 free, 3200.1 used, 205.6 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 5564.7 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4607 root 20 0 2644120 488096 364456 R 125.2 6.1 2:11.44 handle_cams_4
3743 root 20 0 18.7g 534972 48812 S 121.6 6.6 6:29.43 nvargus-daemon
4728 root 20 0 1522856 268424 197172 R 81.4 3.3 0:15.38 handle_cams_2
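For what it is worth, the 4-20 ms latency spread is straightforward to reproduce with a small std::chrono harness bracketing the conversion call. `measure_ms` below is a generic sketch; the callable passed in would wrap the actual NvConverter round trip for one frame:

```cpp
#include <algorithm>
#include <chrono>
#include <utility>
#include <vector>

// Time n invocations of fn and return {min, max} latency in milliseconds.
// fn is a stand-in for the per-frame conversion call whose jitter is
// being quantified.
template <typename F>
std::pair<double, double> measure_ms(F fn, int n)
{
    using clock = std::chrono::steady_clock;
    std::vector<double> samples;
    samples.reserve(n);
    for (int i = 0; i < n; ++i) {
        auto t0 = clock::now();
        fn();
        auto t1 = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    auto mm = std::minmax_element(samples.begin(), samples.end());
    return {*mm.first, *mm.second};
}
```

A large gap between the returned min and max over a few hundred frames is what indicates the non-deterministic latency described above.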
Questions:
1.) Is there a more efficient way of converting the captured images?
2.) Is there a way to avoid the double color conversion needed for OpenCV (YUV420 -> ARGB32 -> RGB)?
3.) What are the experiences with implementing a custom CUDA color-conversion kernel, and what performance impact can be expected? Does this approach also work for multi-process applications?
And last: what exactly is the nvargus-daemon doing to consume such a huge amount of system performance?
Is it a software ISP that runs not on a dedicated processor but on the CPU cores intended for customer use?