My Jetson EGL code makes compiz use 95% of the CPU

I noticed today that my EGL code makes compiz run at 95% CPU. Does anyone have a hint as to what could cause such a load on compiz? In most cases, this kind of problem comes from the GPU not being used at all to render OpenGL commands (i.e. rendering goes through Mesa in software). I don’t really see how that could be the case here, since I use a texture that comes from a decompressed frame of an MP4 video; without the hardware handling that data, I don’t see how it would work.

In my code I do just two things: glClear() and render the texture from the converter (the MP4 decompressor output is sent to a converter before it can be displayed by EGL). However, in my current test the MP4 is not even loaded, so the rendering loop is just glClear() + eglSwapBuffers().

Anything else I should be looking into?

Note: when I stop my app, compiz goes back to under 1% usage as expected. So it’s definitely my app causing the issue.

  • Update

I tested further by recording start/end times around each part of the loop, and the only part that takes time is eglSwapBuffers(). It looks like, instead of going to sleep until a signal wakes the thread back up, it spins in a tight loop until it detects the VBL starting point. I have a counter to verify the FPS and it is correct (30 fps). So it all makes sense, but I don’t understand why the software couldn’t use the hardware interrupt to detect the next VBL…
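One way to check the tight-loop hypothesis is to compare wall-clock time with the CPU time the calling thread consumed during the call. The sketch below is in C and uses only POSIX clocks; measure_call, call_cost and sleep_stub are hypothetical names introduced here, and sleep_stub merely stands in for the swap call. If the call blocks on an interrupt, CPU time should be far below wall time; if it spins, the two should be close. CLOCK_THREAD_CPUTIME_ID only covers the calling thread, so spinning inside compiz or a driver thread would not show up in this measurement.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the given POSIX clock and return its value in nanoseconds. */
static int64_t clock_ns(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

typedef struct {
    int64_t wall_ns; /* elapsed real time */
    int64_t cpu_ns;  /* CPU time consumed by the calling thread only */
} call_cost;

/* Run fn(arg) once and report its wall-clock and thread CPU cost. */
static call_cost measure_call(void (*fn)(void *), void *arg)
{
    call_cost c;
    int64_t wall0 = clock_ns(CLOCK_MONOTONIC);
    int64_t cpu0 = clock_ns(CLOCK_THREAD_CPUTIME_ID);
    fn(arg);
    c.cpu_ns = clock_ns(CLOCK_THREAD_CPUTIME_ID) - cpu0;
    c.wall_ns = clock_ns(CLOCK_MONOTONIC) - wall0;
    return c;
}

/* Hypothetical stand-in for the call under test: blocks for *arg ms. */
static void sleep_stub(void *arg)
{
    long ms = *(const long *)arg;
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}
```

In the render loop, one would pass a small adapter that calls eglSwapBuffers() on the application's display and surface instead of sleep_stub.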

I found a post which describes a similar issue: Compiz process with high CPU usage – our output devices are TVs and the negotiated refresh rate is 30.

Hi,
Please execute sudo jetson_clocks and share the output of sudo tegrastats for reference, along with the release version ( $ head -1 /etc/nv_tegra_release ).

$ cat /etc/nv_tegra_release
# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t186ref, EABI: aarch64, DATE: Fri Oct 16 19:37:08 UTC 2020

Here is the jetson_clocks output, although I don’t see the point. That’s not going to change how much CPU gets used by a process, is it?

$ sudo jetson_clocks
$ sudo jetson_clocks --show
SOC family:tegra194  Machine:Jetson-AGX
Online CPUs: 0-7
CPU Cluster Switching: Disabled
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu4: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu5: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu6: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
cpu7: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1190400 CurrentFreq=1190400 IdleStates: C1=0 c6=0 
GPU MinFreq=905250000 MaxFreq=905250000 CurrentFreq=905250000
EMC MinFreq=204000000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=1
Fan: speed=0
NV Power Mode: MODE_30W_ALL

Here are the stats WITH MY FIX (i.e. a clock_nanosleep() which should NOT be required):

$ sudo tegrastats
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [6%@1190,0%@1190,0%@1190,0%@1190,1%@1190,0%@1190,2%@1190,9%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@52.75C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2430/2430
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [5%@1190,2%@1190,0%@1190,0%@1190,0%@1190,0%@1190,2%@1190,9%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 10%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@53C PMIC@100C AUX@47C CPU@49.5C thermal@48.35C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2410
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [8%@1190,0%@1190,0%@1190,0%@1190,0%@1190,0%@1190,2%@1190,8%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 9%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@53C PMIC@100C AUX@47C CPU@49.5C thermal@48.2C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2403
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [5%@1190,0%@1190,0%@1190,0%@1190,0%@1190,0%@1190,4%@1190,10%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@53C PMIC@100C AUX@47C CPU@49.5C thermal@48.35C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2430/2410
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [6%@1190,1%@1190,0%@1190,0%@1189,0%@1190,0%@1190,6%@1190,10%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 10%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@53C PMIC@100C AUX@47C CPU@49.5C thermal@48.2C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2406
RAM 5849/31919MB (lfb 5509x4MB) SWAP 0/15959MB (cached 0MB) CPU [6%@1190,0%@1190,0%@1190,0%@1190,0%@1190,0%@1190,4%@1190,12%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@48.5C GPU@49C Tdiode@53C PMIC@100C AUX@47C CPU@49C thermal@48.2C Tboard@49C GPU 778/778 CPU 622/622 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2403

And here is another run with my clock_nanosleep() trick turned off. This time you can see really high CPU usage, and of course jetson_clocks didn’t help at all.

$ sudo tegrastats
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [9%@1190,1%@1190,2%@1190,0%@1190,2%@1190,0%@1190,1%@1190,92%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2390
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [5%@1190,0%@1190,0%@1190,0%@1190,0%@1190,0%@1190,2%@1190,97%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 9%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2390
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [7%@1190,2%@1190,2%@1190,0%@1190,0%@1190,0%@1190,2%@1190,94%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 4%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 778/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2390
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [21%@1190,8%@1190,0%@1190,0%@1190,0%@1190,0%@1190,8%@1190,71%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 9%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@50C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2390
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [25%@1190,2%@1190,6%@1190,0%@1190,0%@1190,0%@1190,2%@1190,73%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.7C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2430/2398
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [35%@1190,1%@1190,0%@1190,0%@1190,4%@1190,1%@1190,7%@1190,72%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 10%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 1% AO@48.5C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2396
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [28%@1190,0%@1190,0%@1190,0%@1190,1%@1190,0%@1190,1%@1190,73%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 1% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2021 CV 0/0 VDDRQ 777/777 SYS5V 2390/2395
RAM 5913/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [29%@1190,2%@1190,3%@1190,0%@1190,0%@1190,0%@1190,3%@1190,74%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2177/2040 CV 0/0 VDDRQ 777/777 SYS5V 2430/2400
RAM 5914/31919MB (lfb 5494x4MB) SWAP 0/15959MB (cached 0MB) CPU [8%@1190,0%@1190,0%@1190,0%@1190,0%@1190,0%@1190,1%@1190,97%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 10%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2038 CV 0/0 VDDRQ 777/777 SYS5V 2390/2398
RAM 5913/31919MB (lfb 5493x4MB) SWAP 0/15959MB (cached 0MB) CPU [7%@1190,0%@1190,2%@1190,6%@1190,0%@1190,0%@1190,3%@1190,87%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 0% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2036 CV 0/0 VDDRQ 777/777 SYS5V 2390/2398
RAM 5914/31919MB (lfb 5493x4MB) SWAP 0/15959MB (cached 0MB) CPU [7%@1190,1%@1190,0%@1190,0%@1190,0%@1190,0%@1190,8%@1190,90%@1190] EMC_FREQ 10%@1600 GR3D_FREQ 0%@905 NVDEC 345 NVDEC1 345 APE 150 MTS fg 0% bg 1% AO@49C GPU@49C Tdiode@53.25C PMIC@100C AUX@47.5C CPU@49.5C thermal@48.55C Tboard@49C GPU 777/777 CPU 777/777 SOC 2021/2035 CV 0/0 VDDRQ 777/777 SYS5V 2430/2400

Again, the culprit is eglSwapBuffers(), which apparently doesn’t know how to wait for the VBL with an interrupt… so instead it uses 100% of a CPU. If I sleep, usage drops to about 10%, which is what I have now. But I’d like eglSwapBuffers() to work properly instead.
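The thread does not show the clock_nanosleep() workaround itself, so the following is only a minimal sketch of how such frame pacing could look in C, assuming a fixed frame period and absolute CLOCK_MONOTONIC deadlines. The names frame_pacer, pacer_init and pacer_wait are hypothetical, and where exactly the sleep sits relative to eglSwapBuffers() in the original code is not stated.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

typedef struct {
    struct timespec next; /* absolute CLOCK_MONOTONIC deadline */
    long period_ns;       /* frame period, e.g. 33333333 for 30 Hz */
} frame_pacer;

/* Start pacing from "now" with the given period (must be under 1 s). */
static void pacer_init(frame_pacer *p, long period_ns)
{
    assert(period_ns > 0 && period_ns < NSEC_PER_SEC);
    p->period_ns = period_ns;
    clock_gettime(CLOCK_MONOTONIC, &p->next);
}

/* Advance the deadline by one period and sleep until it is reached.
 * Absolute deadlines keep timing errors from accumulating frame to frame.
 * If the deadline is already in the past, the call returns immediately. */
static void pacer_wait(frame_pacer *p)
{
    p->next.tv_nsec += p->period_ns;
    if (p->next.tv_nsec >= NSEC_PER_SEC) {
        p->next.tv_nsec -= NSEC_PER_SEC;
        p->next.tv_sec += 1;
    }
    /* clock_nanosleep() returns the error number directly; retry on EINTR. */
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                           &p->next, NULL) == EINTR)
        ;
}

/* Assumed usage inside a render loop (not taken from the thread):
 *   frame_pacer p; pacer_init(&p, 33333333L);
 *   for (;;) { draw(); pacer_wait(&p); eglSwapBuffers(dpy, surf); }
 */
```

With the thread asleep for most of the frame, eglSwapBuffers() has far less time left in which to spin, which would be consistent with the drop to about 10% reported above.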

Hi,
It looks similar to the implementation of NvEglRenderer. In renderInternal(), we wait on a condition variable:

        pthread_cond_timedwait(&render_cond, &render_lock,
                &last_render_time);

This lets us call eglSwapBuffers() at a fixed interval. Would a wait like this work for you, or do you have to call eglSwapBuffers() directly without waiting?

That’s exactly what I ended up doing as well. I detect the display refresh rate and add a sleep just before SwapBuffers when not rendering at 60 Hz. I’d really like this to be fixed as well.

Hi Dane,

Oh! You’re correct, there is a wait in there too! Weird. And it looks like you have it set up at exactly the duration of one VBL, which means you may miss one once in a while, since a timed wait can be a little off “in the wrong direction”.
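To illustrate the concern about a wait that lasts exactly one VBL period, here is a small arithmetic sketch in C. It computes a pre-swap sleep that ends slightly before the next refresh, leaving the remainder to eglSwapBuffers(). The function names and the safety margin are assumptions made for this example; nothing here is taken from NvEglRenderer.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one refresh period in nanoseconds, rounded down. */
static int64_t frame_period_ns(int refresh_hz)
{
    assert(refresh_hz > 0);
    return 1000000000LL / refresh_hz;
}

/* Sleep time before the swap: the frame period minus the time already
 * spent rendering and minus a safety margin, clamped at zero. Waking up
 * margin_ns early means a late wake-up is less likely to miss the VBL. */
static int64_t pre_swap_sleep_ns(int refresh_hz, int64_t elapsed_ns,
                                 int64_t margin_ns)
{
    int64_t remaining = frame_period_ns(refresh_hz) - elapsed_ns - margin_ns;
    return remaining > 0 ? remaining : 0;
}
```

At 30 Hz with 5 ms already spent rendering and a 2 ms margin, this gives a sleep of 26333333 ns. A larger margin makes a missed VBL less likely but leaves more time for the swap call to spin.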

Anyway, that means I have it right, but the eglSwapBuffers() is not working as expected. Too bad. When I wrote my first instance and used glSwapBuffer() it worked as expected. No wait required…

Thank you.
Alexis

Hi,
We shall fix this in JP4.5. Please upgrade and give it a try. Thanks.
