Xorg resource usage soars, causing the system to freeze

We are using a Jetson Nano module for machine vision development.
We are now seeing GPU errors: when the error is reported, Xorg's resource usage soars, causing the system to freeze and occasionally reboot. The log is attached. Please help analyze the cause, thank you.
session.log (105.6 KB)

Could you point out which line has the error that is related to Xorg?

FYI, you have a lot of USB errors and i2c errors (and i2c is used for the HDMI cable’s DDC wire, which in turn communicates the monitor’s specs to the driver). I’d say there is some other hardware error and perhaps it shows up under Xorg, but Xorg may not be the actual cause. You might want to mention the specific carrier board if not a dev kit (or verify it is a dev kit), what release is flashed, and what might be attached to the unit (e.g., monitor, keyboard, mouse, camera, other).
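To get a quick picture of which subsystems dominate a log like this, something along these lines could help. It is only a rough sketch: the keyword groups are my own guesses (not taken from the attached log), and `classify_log` is a hypothetical helper name, so adjust both to whatever actually appears in your dmesg output.

```python
from collections import Counter

# Hypothetical keyword groups; these are assumptions, so tune them
# to the messages that actually show up in your own log.
KEYWORDS = {
    "usb": ("usb", "xhci"),
    "i2c": ("i2c",),
    "edid": ("edid",),
    "gpu": ("gpu", "gk20a"),
}

def classify_log(text):
    """Count log lines per subsystem using case-insensitive keyword matching.

    A line is counted at most once per group, but may count toward
    several groups (for example an EDID read failure over i2c).
    """
    counts = Counter()
    for line in text.splitlines():
        lower = line.lower()
        for group, words in KEYWORDS.items():
            if any(word in lower for word in words):
                counts[group] += 1
    return counts
```

Feeding it the text of a saved log (for example `classify_log(open("log-0512.txt").read())`) gives a per-group line count. If the USB and i2c counts dwarf the GPU count, that would support looking at the hardware first rather than at Xorg.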

Actually, the log looks like the first boot-up configuration never finished at all. I don’t think this issue has anything to do with Xorg…

Please share details about your board.

We use a custom carrier board that we designed and developed ourselves. The L4T version is 32.4.2.
Attached is the latest log I captured, which contains the GPU errors. While capturing it, I ran the top command to show Xorg's resource usage.

We have two Nano modules, tested on the same custom carrier board; only one of them shows a sudden increase in Xorg's resource usage.
Please help analyze the log. What is causing this, and how can it be optimized? Thank you.
log-0512.txt (40.5 KB)

Are you running any application when this error occurs?

Does this module hit the same error when it is on the NVIDIA devkit?

Could you also dump the full dmesg?

Keep in mind that if your board layout is not an exact match for the dev kit, then you must customize the device tree. I see EDID errors; EDID is read from the monitor over i2c, and without it graphics cannot function normally.

What I see in the top listing for Xorg is not particularly large in resource usage. As an example, the first line of the top output shows 2.0% CPU and 1.1% memory. The GPU shares RAM with the system, and if your system RAM is 8GB and you use 50%, then 4GB is not actually much for modern video or for many CUDA applications. At the end I do see 84.8%, and perhaps you are exceeding the available RAM, since the operating system itself must also have some RAM (the GPU is unable to use swap and relies on physical RAM).
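If you want to watch overall RAM pressure rather than just the Xorg line in top, a small sketch like the following could be used. It assumes the standard Linux `/proc/meminfo` format with `MemTotal` and `MemAvailable` fields; `parse_meminfo` and `used_percent` are hypothetical helper names of mine, not part of any NVIDIA tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; lines that do not look like
    "Key:   <number> [unit]" are skipped.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

def used_percent(info):
    """Percentage of physical RAM that is not available for new allocations."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total
```

On the device, `used_percent(parse_meminfo(open("/proc/meminfo").read()))` gives one sample. Logging it once a second while reproducing the freeze would show whether physical RAM is actually running out when Xorg's usage climbs.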

Until the EDID is fixed though (via device tree routing the correct i2c to the DDC wire of HDMI) it is hard to debug much related to GPU with video (though I suppose non-HDMI use of the GPU would not be a problem).
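Once you can read EDID bytes at all, a basic sanity check is easy to do. The sketch below relies only on two facts from the EDID format: the 128-byte base block starts with the fixed header `00 FF FF FF FF FF FF 00`, and all 128 bytes sum to 0 modulo 256. How you obtain the bytes is platform-specific and not assumed here, and `check_edid_block` is a hypothetical helper name.

```python
# Fixed 8-byte header that begins every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])

def check_edid_block(block):
    """Return a list of problems found in an EDID base block.

    An empty list means the header and checksum both look valid.
    Only the first 128 bytes are examined; extension blocks are ignored.
    """
    problems = []
    if len(block) < 128:
        problems.append("short read: %d bytes" % len(block))
        return problems
    base = block[:128]
    if base[:8] != EDID_HEADER:
        problems.append("bad header")
    if sum(base) % 256 != 0:
        problems.append("bad checksum")
    return problems
```

A short read or an all-zero block would be consistent with the i2c/DDC line not being routed correctly in the device tree, while a bad checksum on otherwise plausible data points more toward signal integrity on the DDC wires.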