Comparing performance of Libargus sample on Xavier NX

I’m investigating high CPU usage on our Xavier NX 6-camera setup.
Running the 13_multi_camera sample from Libargus with the following patch uses ~60% CPU on each of the 6 CPU cores.
NVP Model: 20W 6CORE
Jetson Clocks: Active

Is this expected?

--- multi_camera_node_orig.cc	2022-11-03 09:47:13.674088272 +1100
+++ multi_camera_node_edits.cc	2022-11-03 09:46:55.493839507 +1100
@@ -44,13 +44,13 @@
 
 /* Constants */
 static const uint32_t MAX_CAMERA_NUM = 6;
-static const uint32_t DEFAULT_FRAME_COUNT = 100;
+static const uint32_t DEFAULT_FRAME_COUNT = 1000;
 static const uint32_t DEFAULT_FPS = 30;
 static const Size2D<uint32_t> STREAM_SIZE(640, 480);
 
 /* Globals */
 UniqueObj<CameraProvider> g_cameraProvider;
-NvEglRenderer *g_renderer = NULL;
+// NvEglRenderer *g_renderer = NULL;
 uint32_t g_stream_num = MAX_CAMERA_NUM;
 uint32_t g_frame_count = DEFAULT_FRAME_COUNT;
 
@@ -128,7 +128,7 @@
             ORIGINATE_ERROR("Failed to create EglOutputStreamSettings");
 
         iEglStreamSettings->setPixelFormat(PIXEL_FMT_YCbCr_420_888);
-        iEglStreamSettings->setEGLDisplay(g_renderer->getEGLDisplay());
+        // iEglStreamSettings->setEGLDisplay(g_renderer->getEGLDisplay());
         iEglStreamSettings->setResolution(STREAM_SIZE);
 
         m_outputStream.reset(iCaptureSession->createOutputStream(streamSettings.get()));
@@ -347,10 +347,10 @@
             {
                 /* Composite multiple input to one frame */
                 NvBufferComposite(m_dmabufs, m_compositedFrame, &m_compositeParam);
-                g_renderer->render(m_compositedFrame);
+                // g_renderer->render(m_compositedFrame);
             }
-            else
-                g_renderer->render(m_dmabufs[0]);
+            // else
+            // g_renderer->render(m_dmabufs[0]);
         }
 
         CONSUMER_PRINT("Done.\n");
@@ -374,10 +374,10 @@
     static bool execute()
     {
         /* Initialize EGL renderer */
-        g_renderer = NvEglRenderer::createEglRenderer("renderer0", STREAM_SIZE.width(),
-                                                      STREAM_SIZE.height(), 0, 0);
-        if (!g_renderer)
-            ORIGINATE_ERROR("Failed to create EGLRenderer.");
+        // g_renderer = NvEglRenderer::createEglRenderer("renderer0", STREAM_SIZE.width(),
+        //   STREAM_SIZE.height(), 0, 0);
+        // if (!g_renderer)
+        // ORIGINATE_ERROR("Failed to create EGLRenderer.");
 
         /* Initialize the Argus camera provider */
         g_cameraProvider = UniqueObj<CameraProvider>(CameraProvider::create());
@@ -447,7 +447,7 @@
         g_cameraProvider.reset();
 
         /* Cleanup EGL Renderer */
-        delete g_renderer;
+        // delete g_renderer;
 
         return true;
     }

Hello Hommus,

May I know how you obtained the 60% usage figure? Could you please check with the top utility, with Irix mode toggled off?
Besides, you should configure maximum performance mode (which keeps clock frequencies fixed instead of varying them) when evaluating CPU usage.
We've tested this before, configured with $ sudo nvpmodel -m 0 and $ sudo jetson_clocks.
Checking CPU usage while running $ ./argus_camera shows ~12% CPU usage for the 6-camera use case (tested on AGX Xavier).

The original usage was obtained using jtop.

After switching to top with Irix mode off and jetson_clocks enabled, argus_camera shows:

  • 15W 2CORE: 45% CPU usage
  • 20W 6CORE: 22% CPU usage
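
For readers comparing these numbers: with Irix mode off, top divides a process's CPU percentage by the number of online cores, while with Irix mode on the figure is relative to a single core and can exceed 100%. A minimal sketch of the conversion follows; the function names are hypothetical and only illustrate the arithmetic, they are not part of top or Libargus.

```cpp
// Convert a top "Irix mode off" percentage (averaged over all online cores)
// to the "Irix mode on" figure, expressed as a percentage of one core.
// Hypothetical helper, for illustration only.
double irixOffToOn(double pctIrixOff, unsigned onlineCores)
{
    return pctIrixOff * static_cast<double>(onlineCores);
}

// Inverse conversion: percentage of one core to the all-core average.
// Returns 0 when no cores are reported, to avoid dividing by zero.
double irixOnToOff(double pctIrixOn, unsigned onlineCores)
{
    if (onlineCores == 0)
        return 0.0;
    return pctIrixOn / static_cast<double>(onlineCores);
}
```

Applied to the figures above, irixOffToOn(22.0, 6) gives 132 (about 1.3 cores busy in 20W 6CORE mode) and irixOffToOn(45.0, 2) gives 90 (about 0.9 of a core in 15W 2CORE mode).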

Hello Hommus,

I forgot to mention that, since we don't have a 6-camera reference board for Xavier NX, we tested the 6-camera use case on AGX Xavier.

Hello Hommus,

BTW, higher CPU usage is expected on Xavier NX, since AGX Xavier has 8 CPU cores but Xavier NX has only 6.

So my CPU usage is to be expected?

Hello Hommus,

May I know what the actual use case is?

You may try reducing the number of cameras to see whether the CPU usage drops.
I don't have numbers since we don't have a 6-camera reference board for Xavier NX.

FYI, both the camera application and EGL streams consume CPU resources; most of the CPU usage is taken by the argus_camera application due to buffer transmission.
Besides, the camera pipeline still needs the CPU to run many algorithms such as AE, AWB, ToneMap, etc.

Use case: 6 CSI cameras → Libargus → ROS.
We are using Libargus to access the getSensorSofTimestampTsc function, for camera synchronisation.

Is there any way to optimise the buffer transmit?

We've found that even after we disable any processing/rendering in our Libargus application, the CPU usage is still around 60% according to jtop, htop and tegrastats.

These two figures do not align; please review your evaluation steps.

@JerryChang
The 22% is from top with Irix mode off.
The 60% is from tegrastats, htop and jtop.

Is there any way to optimise the buffer transmit?
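
One thing to keep in mind when comparing the two figures: top's process list reports usage per process, whereas tegrastats, htop and jtop display per-core load for the whole system, so the numbers need not match. A minimal sketch, assuming a Linux /proc/stat layout, of how a per-core busy percentage can be derived from two samples of the same cpuN line; the struct and function names are hypothetical and this is not how any of those tools is necessarily implemented.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Cumulative jiffies for one "cpuN" line of /proc/stat.
struct CpuTimes {
    uint64_t busy = 0;
    uint64_t total = 0;
};

// Parse one per-core line such as "cpu0 100 0 50 800 50 0 0 0 0 0".
// Only the first eight fields (user nice system idle iowait irq softirq steal)
// are summed, because guest time is already included in user/nice.
bool parseCpuLine(const std::string& line, CpuTimes& out)
{
    std::istringstream in(line);
    std::string label;
    if (!(in >> label) || label.compare(0, 3, "cpu") != 0)
        return false;

    uint64_t value = 0, total = 0, idle = 0;
    int field = 0;
    while (field < 8 && (in >> value)) {
        total += value;
        if (field == 3 || field == 4)  // idle and iowait count as not busy
            idle += value;
        ++field;
    }
    if (field < 4)  // need at least user, nice, system, idle
        return false;

    out.total = total;
    out.busy = total - idle;
    return true;
}

// Busy percentage of one core between two samples of the same cpuN line.
double busyPercent(const CpuTimes& before, const CpuTimes& after)
{
    const uint64_t dTotal = after.total - before.total;
    if (dTotal == 0)
        return 0.0;
    return 100.0 * static_cast<double>(after.busy - before.busy) /
           static_cast<double>(dTotal);
}
```

To use it on the device, one would read /proc/stat twice, roughly a second apart, parse the matching cpu0 to cpu5 lines from each read, and pass each pair to busyPercent.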

Hello Hommus,

The number of online CPU cores matters; you should toggle Irix mode off to obtain the average CPU usage.
And since you ran argus_camera to obtain these results, it's 22% CPU usage for your actual camera use case.

May I know what total CPU usage you expect?
Have you tried enabling AeLock, AwbLock, etc. in argus_camera to check?
BTW, you may also refer to the MMAPI samples, such as 13_multi_camera, and modify the code to disable EGL streams.

I provided a patch in my original message (at the top) that comments out the EGL display related code from sample 13_multi_camera. Is that what you mean by disabling EGL streams?

Hello Hommus,

Yes, that looks like the right approach to disable EGL streams.

There is also a simple gst pipeline that launches a camera stream without display and only shows the frame rate. You may try this to evaluate the CPU usage.
For example,
$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM),format=I420' ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v
