Jetson Nano CPU load average for CSI video recording unusually high

We are also observing very high CPU usage when simply streaming from two CSI2 cameras.

Background

This forum has previously discussed unexplained and concerningly high CPU loads when using the NVIDIA Argus CSI2 GStreamer source plugin (nvarguscamerasrc).

This reply aims to provide a concrete example that anyone with a Jetson device (we’re testing on a Jetson Xavier NX) can run to see the high CPU usage for themselves.

Goal

This post seeks to understand why the overhead is so high, whether it can be reduced, and if so, how. We would also like everyone to be able to quickly run the same experiments we did and observe the overhead for themselves.

Observing the Overhead

For these experiments we will use up to two IMX219-160 cameras, connected to the Jetson NX developer kit’s CSI2 ribbon cable connectors. You should set the resolution and framerate in the experiments to values supported by your own cameras.

For a single CSI2 camera, you can run the following:

gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM), width=(int)3280, height=(int)2464, format=(string)NV12, framerate=(fraction)21/1' ! nvvidconv ! queue ! fakesink

To terminate this pipeline, press CTRL-C in your terminal at any time to send SIGINT.

While this is running, open a new terminal and observe the CPU usage of /usr/sbin/nvargus-daemon. One way to do this is to run htop (installed via sudo apt install htop) and click on the CPU column to sort processes by CPU load. The nvargus-daemon process should appear at or near the top with around 18% CPU load.

An alternative means of viewing this is to run:

top -p `pgrep "nvargus"`
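To get a number that is easier to compare between runs than an interactive display, the batch output of top can be averaged. The following is a minimal sketch, not something from the original experiments: the helper name avg_cpu_from_top is made up for illustration, and it assumes top's default column layout (field 9 is %CPU, field 12 is COMMAND).

```shell
# Average the %CPU column of `top -b` output for processes whose command
# name starts with the given prefix. Reads top's batch output on stdin.
# Assumes top's default column layout: field 9 = %CPU, field 12 = COMMAND.
avg_cpu_from_top() {
    awk -v name="$1" '
        $12 ~ ("^" name) { sum += $9; n++ }
        END {
            if (n) printf "%.1f\n", sum / n
            else   print "no samples"
        }
    '
}

# Usage on the Jetson while a pipeline is running (10 samples, 1 s apart):
#   top -b -n 10 -d 1 -p "$(pgrep -o nvargus)" | avg_cpu_from_top nvargus
```

The first sample that top prints in batch mode may not reflect the current load, so taking more samples makes the average more trustworthy.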

For two cameras you can run:

gst-launch-1.0 nvarguscamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM), width=(int)3280, height=(int)2464, format=(string)NV12, framerate=(fraction)21/1' ! nvvidconv ! queue ! fakesink nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM), width=(int)3280, height=(int)2464, format=(string)NV12, framerate=(fraction)21/1' ! nvvidconv ! queue ! fakesink
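The two-camera command is the single-camera branch repeated once per sensor-id, so it can be generated for any number of cameras. The following is a sketch under assumptions: the function name launch_cameras and the GST_LAUNCH override variable are hypothetical helpers introduced here for illustration, and the caps string is copied from the commands above.

```shell
# Launch one identical capture branch per sensor, for sensor-id 0..count-1.
# GST_LAUNCH is a hypothetical override (not a GStreamer variable): set it to
# "echo" to print the command line instead of running gst-launch-1.0.
launch_cameras() {
    count=$1
    set --    # reuse the positional parameters as the argument list
    id=0
    while [ "$id" -lt "$count" ]; do
        # Caps copied from the post; adjust to what your cameras support.
        set -- "$@" nvarguscamerasrc "sensor-id=$id" '!' \
            "video/x-raw(memory:NVMM), width=(int)3280, height=(int)2464, format=(string)NV12, framerate=(fraction)21/1" '!' \
            nvvidconv '!' queue '!' fakesink
        id=$((id + 1))
    done
    ${GST_LAUNCH:-gst-launch-1.0} "$@"
}

# Dry run (prints the arguments without quoting):
#   GST_LAUNCH=echo; launch_cameras 2
# Real run on a Jetson with two cameras attached:
#   launch_cameras 2
```

Building the arguments with set -- keeps each caps string as a single argument, the same way the single quotes do in the hand-written commands.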

Observations

With the power mode set to MODE_10W_4CORE and the clocks as follows:

Online CPUs: 0-3
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
cpu4: Online=0 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
cpu5: Online=0 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=1 c6=1
GPU MinFreq=114750000 MaxFreq=803250000 CurrentFreq=114750000
EMC MinFreq=204000000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=0
Fan: PWM=130
NV Power Mode: MODE_10W_4CORE

We observed ~18% CPU utilization for a single camera and up to ~37% with two cameras.

We then re-ran the experiments after running sudo jetson_clocks to pin the clocks at their maximum frequencies, which gave the following settings:

SOC family:tegra194  Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-3
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu4: Online=0 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu5: Online=0 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0
GPU MinFreq=803250000 MaxFreq=803250000 CurrentFreq=803250000
EMC MinFreq=204000000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=1
Fan: PWM=130
NV Power Mode: MODE_10W_4CORE

We saw similar CPU utilization with the other clock frequency settings.
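To see exactly what sudo jetson_clocks changed, the two dumps can be diffed. In the dumps in this post the CPU frequency values are identical before and after; the differences are the idle states (C1 and c6 go from 1 to 0), the GPU MinFreq and CurrentFreq (raised to 803250000), and the EMC FreqOverride (0 to 1). The following is a minimal sketch; the helper name clocks_changed and the file names before.txt and after.txt are made up for illustration.

```shell
# Print only the lines that differ between two saved clock dumps.
# Lines prefixed "<" come from the first file, ">" from the second.
clocks_changed() {
    diff "$1" "$2" | grep '^[<>]'
}

# Usage on the Jetson:
#   sudo jetson_clocks --show > before.txt
#   sudo jetson_clocks
#   sudo jetson_clocks --show > after.txt
#   clocks_changed before.txt after.txt
```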