I’ve been struggling with this problem for a while now, and I haven’t found a working solution.
A kworker process constantly uses 100% of one CPU and blocks everything; I can’t even shut the NVIDIA board down.
I know it’s a known problem in the community, but all the solutions I found (for example, the Ask Ubuntu thread "12.04 - Why does kworker cpu usage get so high?") seem to rely on the ACPI interrupt manager, which is (apparently) not present on my board.
In my current setup, I have a CAN bus and a LiDAR (communicating over Ethernet) connected to the board; both use interrupts, as far as I know.
Though the %utilization shown is wrong.
Can you tell me when the utilization shoots up? Is it right after boot, while the system is idling? Or when the CAN bus is busy, or when the LiDAR is talking over Ethernet? Is the issue seen with Ethernet disconnected?
I am just trying to locate where the problem is. We have not seen it locally.
Can you also check on your system what this kworker is doing, for example using ftrace?
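For reference, here is a minimal sketch of one way to see what the kworker threads are executing, using the kernel's workqueue trace events. It assumes tracefs is available under /sys/kernel/debug/tracing and that the script runs as root; the helper names (`enable_workqueue_events`, `count_work_functions`) are made up for illustration, and the exact trace line layout can vary between kernel versions.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: tracefs is reachable at the usual debugfs location.
TRACING = Path("/sys/kernel/debug/tracing")

# Pulls the task name, PID and work function out of a
# workqueue_execute_start trace line.
EXEC_START = re.compile(
    r"^\s*(\S+)-(\d+)\s.*workqueue_execute_start:.*function (\S+)"
)


def enable_workqueue_events(tracing_dir=TRACING):
    """Turn on the workqueue trace events (needs root)."""
    (tracing_dir / "events" / "workqueue" / "enable").write_text("1")
    (tracing_dir / "tracing_on").write_text("1")


def count_work_functions(lines):
    """Count how often each (task, pid, function) started a work item."""
    counts = Counter()
    for line in lines:
        match = EXEC_START.match(line)
        if match:
            task, pid, func = match.groups()
            counts[(task, int(pid), func)] += 1
    return counts
```

A possible way to use it: call `enable_workqueue_events()`, reproduce the high CPU usage, then pass the open `trace` file from the tracing directory to `count_work_functions()` and look at `.most_common(10)`. A work function that dominates the counts is a good candidate for what the busy kworker is doing.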
Thanks for your answer.
Unfortunately, I don’t have access to the NVIDIA board at the moment. As soon as I can ftrace the problem, I’ll post the result here.
The problem usually appears when both CAN and Ethernet are connected and talking. Once it appears, there’s no way of stopping it, and the usage stays at that percentage even when idling.
Please note that I’ve observed this problem with different LiDARs and CAN transceivers, so I tend to rule out a hardware/software problem on that side.
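Since both CAN and Ethernet are interrupt driven, one rough way to check whether an interrupt storm coincides with the problem is to compare two snapshots of /proc/interrupts. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual layout of that file (a header row of CPU columns, then one row per IRQ).

```python
from pathlib import Path


def parse_interrupts(text):
    """Return {(irq_label, description): count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[(label.strip(), description)] = sum(counts)
    return totals


def busiest_irqs(before, after, top=5):
    """Rank IRQs by how much their count grew between two snapshots."""
    growth = {key: after[key] - before.get(key, 0) for key in after}
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return [(key, delta) for key, delta in ranked[:top] if delta > 0]


def read_interrupts(path=Path("/proc/interrupts")):
    """Read the current interrupt counters as text."""
    return path.read_text()
```

Taking one snapshot with `parse_interrupts(read_interrupts())`, waiting a few seconds while the kworker is busy, taking a second one, and passing both to `busiest_irqs()` would show which IRQ lines are firing most during that window.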
There is a display-related SMMU error. An address outside the display’s mapped region is being accessed, which throws these errors. But I don’t think you are bothered about those.
One thing: CPU0 is only spewing this error and is not doing any workqueue jobs. If CPU0 is stuck, this could be the reason.
Regarding question 1, I agree with you. It seems that the display is causing problems. After reading the ftrace output, we’ve been working without any display attached, and the problem has not appeared since then. Our guess is that the display is the cause. Is that a reasonable guess in your opinion? How can we solve it?
Regarding question 2, the PID was 55. I don’t think it was kworker/0:3. What I’ve seen is that the kworker hogging the CPU changes from time to time.
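Because the busy kworker changes over time, it can help to rank the kworker threads by CPU time consumed over a short window rather than watching a single PID. The following is a minimal sketch under the assumption that /proc/<pid>/stat follows the layout documented in proc(5) (utime and stime are fields 14 and 15); the function names are invented for this example.

```python
from pathlib import Path


def parse_stat(stat_text):
    """Return (pid, comm, utime + stime in clock ticks) from a stat line."""
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")  # comm may itself contain parentheses
    pid = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    rest = stat_text[rpar + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11], rest[12].
    return pid, comm, int(rest[11]) + int(rest[12])


def kworker_ticks(proc=Path("/proc")):
    """Snapshot CPU ticks for every task whose name starts with 'kworker'."""
    ticks = {}
    for entry in proc.iterdir():
        if not entry.name.isdigit():
            continue
        try:
            pid, comm, used = parse_stat((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # the task exited while we were reading it
        if comm.startswith("kworker"):
            ticks[(pid, comm)] = used
    return ticks


def busiest(before, after, top=5):
    """Rank kworkers by CPU ticks consumed between two snapshots."""
    deltas = {key: after[key] - before.get(key, 0) for key in after}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]
```

Calling `kworker_ticks()` twice a few seconds apart and passing both results to `busiest()` gives the PID and name of the kworkers that used the most CPU in that interval, which can then be matched against the task names in the ftrace output.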
Yeah, we should fix the SMMU display issue.
What display panel are you using? Is it connected over HDMI or over DP?
Are you using a Jetson or your own custom hardware?
Can you share the boot log?
Thanks for the log.
Can you boot without HDMI connected and then connect after boot?
This issue was fixed in the latest release. Which release version are you using?