Very bad performance, or is something wrong?

I managed to run my first experiments. I connected two cameras to the Jetson Nano, both 1600x1300 at 60 fps. I use OpenCV (C++) and wrote a loop that just reads frames into a buffer. It doesn’t do anything with that buffer yet.
When I run this and check the system monitor, I see that 3 of the 4 CPU cores are about 70% busy! But I’m doing nothing with the frames! The CPUs should be mostly idle. Only data is moved, and that is probably done with DMA.
I need to write some code that will analyse these frames, but it seems I will not have enough processing power left for that.
The GPU is also busy (fluctuating between 0 and 80%)! Why?
Finally, 80% of the 4 GB of RAM is in use! Yes, I have big images, but one image is at most 1600x1300x2 bytes, thus about 4 MB. I assume it is only storing 1 or 2 pictures per channel (otherwise the latency would be very bad).

Is the conclusion that a Jetson Nano is way too weak to handle this?
I was planning to switch to a Jetson Orin Nano later, but then with 6 cameras!
I hope I’m doing something wrong and this can be improved a lot.
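
The frame-size estimate above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `frameBytes` and `bytesPerSecond` are made up for this example, and it assumes uncompressed frames with no row padding.

```cpp
#include <cstdint>

// Bytes needed to hold one uncompressed frame (no row padding assumed).
constexpr std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Sustained data rate in bytes per second for one camera.
constexpr std::uint64_t bytesPerSecond(std::uint64_t width, std::uint64_t height,
                                       std::uint64_t bytesPerPixel, std::uint64_t fps) {
    return frameBytes(width, height, bytesPerPixel) * fps;
}

// 1600x1300 at 16 bits per pixel: 4,160,000 bytes, roughly 4 MB per frame.
static_assert(frameBytes(1600, 1300, 2) == 4160000, "frame size");
// The same resolution at 8 bits per pixel and 60 fps: 124.8 MB/s per camera.
static_assert(bytesPerSecond(1600, 1300, 1, 60) == 124800000, "data rate");
```

Two 8-bit streams at this rate add up to roughly 250 MB/s of raw data, so every extra copy or format conversion done on the CPU is multiplied by that amount.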

Obviously the ARM cores of the Nano (derived from TX1) are not as efficient as recent desktop CPU cores.

So at this pixel rate you should avoid CPU processing as much as possible. The Jetson has dedicated hardware for video scaling, format conversion and encoding/decoding, plus the GPU for image processing.

It would help if you explained your case in more detail:

  • What kind of cameras (CSI or USB) you have and which formats they offer.
  • Share the code you’re using with OpenCV (opening the cameras, reading from them, displaying). Also share the output of the OpenCV function getBuildInformation().
  • For better advice, describe what kind of processing you intend to do from source to sink.

The thing is that I’m not doing any processing at all! CPUs should be idle.

This is the OpenCV code:

#include <iostream>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

int main(int argc, char** argv)
{
    VideoCapture cap1;
    Mat CameraFrame1;
    VideoCapture cap2;
    Mat CameraFrame2;

    // The open calls were missing from the posted code;
    // device indices 0 and 1 are assumed here.
    cap1.open(0);
    cap2.open(1);

    // Check whether user selected camera is opened successfully.
    if (!cap1.isOpened())
    {
        cout << "***Could not initialize capturing...***\n";
        return -1;
    }
    if (!cap2.isOpened())
    {
        cout << "***Could not initialize capturing...***\n";
        return -1;
    }

    // Loop infinitely to fetch frame from cameras.
    int frame = 0;
    for (;;)
    {
        cap1 >> CameraFrame1;
        cap2 >> CameraFrame2;

        // Check whether received frame has valid pointer; stop if not.
        if (CameraFrame1.empty())
            break;
        if (CameraFrame2.empty())
            break;

        cout << "frame" << frame << endl;
        ++frame;

        // Wait for Escape keyevent to exit from loop
        char keypressed = (char)waitKey(10);
        if (keypressed == 27)
            break;
    }

    return 0;
}

Output of getBuildInformation():

General configuration for OpenCV 4.1.1 =====================================
  Version control:               4.1.1-2-gd5a58aa75

    Timestamp:                   2019-12-13T17:25:11Z
    Host:                        Linux 4.9.140-tegra aarch64
    CMake:                       3.10.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               Release

  CPU/HW features:
    Baseline:                    NEON FP16
      required:                  NEON
      disabled:                  VFPV3

    Built as dynamic libs?:      YES
    C++ Compiler:                /usr/bin/c++  (ver 7.4.0)
    C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--gc-sections  
    Linker flags (Debug):        -Wl,--gc-sections  
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:          dl m pthread rt
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo python2 python3 stitching ts video videoio
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 java js
    Applications:                tests perf_tests examples apps
    Documentation:               NO
    Non-free algorithms:         NO

    GTK+:                        YES (ver 2.24.32)
      GThread :                  YES (ver 2.56.4)
      GtkGlExt:                  NO

  Media I/O: 
    ZLib:                        /usr/lib/aarch64-linux-gnu/ (ver 1.2.11)
    JPEG:                        /usr/lib/aarch64-linux-gnu/ (ver 80)
    WEBP:                        build (ver encoder: 0x020e)
    PNG:                         /usr/lib/aarch64-linux-gnu/ (ver 1.6.34)
    TIFF:                        /usr/lib/aarch64-linux-gnu/ (ver 42 / 4.0.9)
    JPEG 2000:                   build (ver 1.900.1)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    FFMPEG:                      YES
      avcodec:                   YES (57.107.100)
      avformat:                  YES (57.83.100)
      avutil:                    YES (55.78.100)
      swscale:                   YES (4.8.100)
      avresample:                NO
    GStreamer:                   YES (1.14.5)
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            TBB (ver 2017.0 interface 9107)

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Lapack:                      NO
    Eigen:                       YES (ver 3.3.4)
    Custom HAL:                  YES (carotene (ver 0.0.1))
    Protobuf:                    build (3.5.1)

  Python 2:
    Interpreter:                 /usr/bin/python2.7 (ver 2.7.15)
    Libraries:                   /usr/lib/aarch64-linux-gnu/ (ver 2.7.15+)
    numpy:                       /usr/lib/python2.7/dist-packages/numpy/core/include (ver 1.13.3)
    install path:                lib/python2.7/dist-packages/cv2/python-2.7

  Python 3:
    Interpreter:                 /usr/bin/python3 (ver 3.6.9)
    Libraries:                   /usr/lib/aarch64-linux-gnu/ (ver 3.6.9)
    numpy:                       /usr/lib/python3/dist-packages/numpy/core/include (ver 1.13.3)
    install path:                lib/python3.6/dist-packages/cv2/python-3.6

  Python (for build):            /usr/bin/python2.7

    ant:                         NO
    JNI:                         NO
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    /usr


What I need to do:
We have a C++ algorithm that detects dots and extracts pose information from them. It runs on an x86 CPU (2.5 GHz) in under 1 ms per frame (on one thread). So I assume that on the Jetson Nano I should be able to do that within 16 ms.
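
The 16 ms figure corresponds to the frame period at 60 fps. A small sketch of that budget arithmetic follows; the helper name `frameBudgetMs` is hypothetical, and the two-camera case assumes both streams are processed one after another on a single thread, which the post does not state.

```cpp
// Time available per frame, in milliseconds, when `cameras` streams at `fps`
// each are processed sequentially on one thread.
constexpr double frameBudgetMs(double fps, int cameras) {
    return 1000.0 / (fps * cameras);
}

// One camera at 60 fps leaves about 16.7 ms per frame.
static_assert(frameBudgetMs(60.0, 1) > 16.6 && frameBudgetMs(60.0, 1) < 16.7,
              "single camera");
// Two cameras sharing one thread leave about 8.3 ms per frame.
static_assert(frameBudgetMs(60.0, 2) > 8.3 && frameBudgetMs(60.0, 2) < 8.4,
              "two cameras");
```

Whatever time the capture path itself consumes has to be subtracted from this budget before the detection algorithm gets its share.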

My biggest concern now is that the CPUs are loaded at 70% while they are doing nothing.
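
To separate the load of the capture loop from everything else shown in the system monitor, the loop can be timed directly. The following is a minimal sketch, not part of the original program: `cpuUtilisation` is a hypothetical helper, and it assumes a POSIX system (such as the Jetson's Linux) where `std::clock` reports CPU time summed over all threads of the process.

```cpp
#include <chrono>
#include <ctime>

// Fraction of one core used by the whole process while `work` runs:
// CPU time consumed divided by wall-clock time elapsed.
// Values above 1.0 mean more than one core was busy on average.
template <typename Work>
double cpuUtilisation(Work work) {
    const std::clock_t cpuStart = std::clock();
    const auto wallStart = std::chrono::steady_clock::now();

    work();  // e.g. a lambda that grabs a few hundred frames

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    const double cpuSeconds =
        static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    const double wallSeconds =
        std::chrono::duration<double>(wallEnd - wallStart).count();
    return wallSeconds > 0.0 ? cpuSeconds / wallSeconds : 0.0;
}
```

Wrapping only the frame-grabbing calls in the callable and comparing the result for different backends or pixel formats would show which part of the capture path is costing CPU time.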

This probably uses the V4L or FFMPEG backend, which may explain the CPU load. Since your OpenCV build also supports GStreamer, you could give that a try. You can list the modes provided by your cameras with:

sudo apt install v4l-utils

v4l2-ctl -d0 --list-formats-ext
v4l2-ctl -d1 --list-formats-ext

You can also select the maximum-performance nvpmodel profile and boost the clocks:

sudo nvpmodel -m 0
sudo jetson_clocks

Both cameras only support 8 or 16 bit greyscale. No compression.

	Index       : 0
	Type        : Video Capture
	Pixel Format: 'GREY'
	Name        : 8-bit Greyscale
		Size: Discrete 640x480
			Interval: Discrete 0.006s (180.000 fps)
		Size: Discrete 1280x720
			Interval: Discrete 0.011s (90.000 fps)
		Size: Discrete 320x240
			Interval: Discrete 0.004s (280.000 fps)
		Size: Discrete 1600x1300
			Interval: Discrete 0.017s (60.000 fps)

	Index       : 1
	Type        : Video Capture
	Pixel Format: 'Y16 '
	Name        : 16-bit Greyscale
		Size: Discrete 640x480
			Interval: Discrete 0.006s (180.000 fps)
		Size: Discrete 1280x720
			Interval: Discrete 0.011s (90.000 fps)
		Size: Discrete 320x240
			Interval: Discrete 0.004s (280.000 fps)
		Size: Discrete 1600x1300
			Interval: Discrete 0.025s (40.000 fps)


sudo nvpmodel -m 0
sudo jetson_clocks

make no difference. What do they do?

Does OpenCV use the CPU to copy frame buffers around? Or is that done using DMA controllers?

nvpmodel -m 0 selects the MAXN profile (maximum performance).
jetson_clocks boosts the CPU, GPU and memory clocks.

Does this work?

cv::VideoCapture cap1("v4l2src device=/dev/video0 ! video/x-raw,format=GRAY8,width=1600,height=1300,framerate=60/1 ! appsink drop=1", cv::CAP_GRSTREAMER);

// Or
cv::VideoCapture cap1("v4l2src device=/dev/video0 io-mode=2 ! video/x-raw,format=GRAY8,width=1600,height=1300,framerate=60/1 ! appsink drop=1", cv::CAP_GRSTREAMER);


Neither makes a difference.

(I assume you meant CAP_GSTREAMER at the end)

I tried

cv::VideoCapture cap1("v4l2src device=/dev/video0 ! video/x-raw,format=GRAY8,width=640,height=480,framerate=60/1 ! appsink drop=1", cv::CAP_GSTREAMER);

(a lower resolution) to see whether it gets better. But the resolution is still 1600x1300. So is the line not taking effect at all?

P.S.: where can I find the syntax of this configuration string?
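
On the syntax question: the string is a GStreamer pipeline description, the same notation the gst-launch-1.0 tool accepts. Elements are separated by `!`, each element name may be followed by `property=value` pairs, and an entry such as `video/x-raw,...` is a caps filter that constrains the format between two elements. As a rough sketch, the pipelines used in this thread could be assembled like this; `buildV4l2Pipeline` is a hypothetical helper, not an OpenCV or GStreamer function.

```cpp
#include <string>

// Hypothetical helper: assembles a GStreamer pipeline description
// for a V4L2 camera feeding an appsink.
std::string buildV4l2Pipeline(const std::string& device, const std::string& format,
                              int width, int height, int fps) {
    return "v4l2src device=" + device +                 // source element and its property
           " ! video/x-raw,format=" + format +          // caps filter: pixel format...
           ",width=" + std::to_string(width) +          // ...frame width...
           ",height=" + std::to_string(height) +        // ...frame height...
           ",framerate=" + std::to_string(fps) + "/1" + // ...frame rate as a fraction
           " ! appsink drop=1";                         // sink element the application reads from
}
```

Calling `buildV4l2Pipeline("/dev/video0", "GRAY8", 640, 480, 60)` reproduces the string tried above. Note that the mode listing reports 640x480 only at 180 fps, so a 60/1 request at that size may not match any mode the camera advertises.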

I could limit the resolution this way:

    cap1.set(cv::CAP_PROP_FRAME_WIDTH, 640);
    cap1.set(cv::CAP_PROP_FRAME_HEIGHT, 480);

    cap2.set(cv::CAP_PROP_FRAME_WIDTH, 640);
    cap2.set(cv::CAP_PROP_FRAME_HEIGHT, 480);

This results in roughly a factor of 7 fewer pixels (1600x1300 versus 640x480).
But now all cores are above 80%!
I’m very confused now.
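
The reduction in pixel count between the two modes can be checked directly. A minimal sketch, with `pixelRatio` as a made-up helper name:

```cpp
// Ratio of pixel counts between two frame sizes.
constexpr double pixelRatio(int w1, int h1, int w2, int h2) {
    return static_cast<double>(w1) * h1 / (static_cast<double>(w2) * h2);
}

// 1600x1300 has about 6.8 times as many pixels as 640x480.
static_assert(pixelRatio(1600, 1300, 640, 480) > 6.7 &&
              pixelRatio(1600, 1300, 640, 480) < 6.8, "about 6.8x");
```

Per frame the data volume drops by that factor, but according to the mode listing the camera delivers 640x480 at 180 fps, so the frame rate may have changed along with the resolution.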

Note that I called cap.set() after opening the VideoCapture. Otherwise it did not result in smaller images.
I’m not sure if that is correct.

I did find out that an unwanted conversion from greyscale to RGB is going on, and I am not able to prevent it by setting the format.
But I guess I should start a new thread for this problem.
