TX2 OpenCV StereoBM CPU performance

We are getting odd results on the TX2’s CPU. We are running OpenCV’s Stereo Block Matching (StereoBM) algorithm, and the TX2 is far slower than a standard x86 processor: the Jetson takes about 2000 ms/frame, while a standard x86 processor takes about 100 ms/frame.

Does anyone happen to know what could be causing this 20x performance hit?
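To compare the two machines on equal terms, it may help to run the same small timing script on both. The following is only a sketch: the 640x480 synthetic images and the StereoBM parameters (numDisparities=64, blockSize=15) are assumptions, since the thread does not state the real resolution or settings, and ms_per_frame is a hypothetical helper name. Replace the synthetic pair with your own rectified images to get meaningful numbers.

```python
import time


def ms_per_frame(process, frames, warmup=2):
    """Return the mean wall-clock milliseconds per call of process(frame)."""
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame to time")
    for frame in frames[:warmup]:
        process(frame)  # warm-up calls, not included in the timing
    start = time.perf_counter()
    for frame in frames:
        process(frame)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(frames)


def main():
    try:
        import cv2
        import numpy as np
    except ImportError:
        print("OpenCV or NumPy not installed; nothing to benchmark")
        return
    # Assumed input: a synthetic 8-bit grayscale pair with an 8 pixel shift.
    rng = np.random.default_rng(0)
    left = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)
    right = np.roll(left, -8, axis=1)
    # Assumed parameters; use the values from your own application.
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    pairs = [(left, right)] * 10
    print("OpenCV", cv2.__version__,
          "threads:", cv2.getNumThreads(),
          "optimized:", cv2.useOptimized())
    print("ms/frame:", ms_per_frame(lambda p: stereo.compute(p[0], p[1]), pairs))


if __name__ == "__main__":
    main()
```

Running this unchanged on the TX2 and on the x86 machine, with the same OpenCV version if possible, shows whether the gap is in StereoBM itself or somewhere else in the pipeline.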

Hi,
Please set the board to maximum performance mode and try again.

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fpower_management_tx2_32.html%23wwpID0E0VN0HA
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fpower_management_tx2_32.html%23wwpID0E0KB0HA

Thanks for the response.

I’ve tried that already and had the same results.

Hi,
Please run ‘sudo tegrastats’ to check if the system is at max performance.
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2FAppendixTegraStats.html

A pure OpenCV application is CPU-based, so some hardware blocks (GPU, NVENC, NVDEC) may not be used. We suggest trying gstreamer or tegra_multimedia_api.

Every single core is at 100% and at max speed. The only reason I’m asking is that we cannot explain why the Jetson’s ARM CPU would be 20x slower than a regular x86 processor.

We suspect the 2 MB L2 cache might be the cause.

Hi,
Not sure, but OpenCV is mainly developed by Intel, so it might run better on x86.
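One thing that may be worth ruling out before blaming the hardware is how each OpenCV build was configured, since SIMD and threading options can differ between the x86 and ARM builds. The sketch below pulls a few lines out of cv2.getBuildInformation(). The key names it looks for ("Baseline", "Parallel framework", "Intel IPP", "NVIDIA CUDA") are assumptions based on recent OpenCV builds and may be spelled differently or be missing in older versions; summarize_build is a hypothetical helper name.

```python
def summarize_build(info, keys=("Baseline", "Parallel framework",
                                "Intel IPP", "NVIDIA CUDA")):
    """Pick selected 'name: value' lines out of OpenCV's build information."""
    summary = {}
    for line in info.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in keys:
            summary.setdefault(name, value.strip())  # keep first occurrence
    return summary


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("OpenCV is not installed")
    else:
        for name, value in summarize_build(cv2.getBuildInformation()).items():
            print(f"{name}: {value}")
```

Comparing the output from both machines shows, for example, whether NEON appears in the ARM build and whether both builds use a parallel framework. It does not by itself explain the 20x gap.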

On Jetson platforms, we recommend using tegra_multimedia_api or gstreamer. Hardware acceleration is enabled in both frameworks.