CPU frequency very low since the system started

lingsong.zheng · January 16, 2025, 9:04am

Hello, we have found a very rare issue with our product. After the system starts, the CPU frequency stays consistently low and the voltage is also abnormal. Please take a look. The logs are attached. The base BSP is r35.3.1. Thank you.
tegrastats.txt (139.0 KB)
dmesg.txt (4.2 MB)

WayneWWW · January 16, 2025, 11:15am

What is the result of

grep "" /sys/class/hwmon/hwmon*/oc*
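
The same check can be scripted. Below is a minimal Python sketch, assuming the over-current nodes appear as files whose names start with `oc` inside each hwmon directory, as the glob above implies. The function name `read_oc_nodes` and its `root` parameter are illustrative and not part of any NVIDIA tool.

```python
import glob
import os


def read_oc_nodes(root="/sys/class/hwmon"):
    """Return {path: contents} for every hwmon*/oc* file under root.

    An empty dict means no over-current nodes are exposed at all.
    """
    results = {}
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "oc*"))):
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                results[path] = f.read().strip()
        except OSError as exc:
            # Some sysfs attributes are write-only or need root; record the error.
            results[path] = f"<unreadable: {exc.strerror}>"
    return results


if __name__ == "__main__":
    nodes = read_oc_nodes()
    if not nodes:
        print("no oc* nodes found under /sys/class/hwmon")
    for path, value in nodes.items():
        print(f"{path}:{value}")
```

An empty result is itself informative: it means the glob matched nothing, which is different from the nodes existing and reading zero.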

lingsong.zheng · January 20, 2025, 3:19am

I checked, and there are no files with “oc” in their names under the /sys/class/hwmon/hwmon0~3 directories.

/sys/class/hwmon/hwmon0# ls
device  name  power  subsystem  temp1_input  uevent

WayneWWW · January 20, 2025, 5:51am

What is the result of

ls -al /sys/class/hwmon/

lingsong.zheng · January 20, 2025, 6:20am

Like this

~# ls -al /sys/class/hwmon/
total 0
drwxr-xr-x  2 root root 0 Mar 27  2023 .
drwxr-xr-x 84 root root 0 Mar 27  2023 ..
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon0 -> ../../devices/virtual/thermal/thermal_zone5/hwmon0
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon1 -> ../../devices/platform/39c0000.tachometer/hwmon/hwmon1
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon2 -> ../../devices/platform/pwm-fan/hwmon/hwmon2
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon3 -> ../../devices/platform/c250000.i2c/i2c-7/7-0040/hwmon/hwmon3
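
A listing like this can be checked mechanically for the over-current monitor. The sketch below is a minimal Python illustration, assuming the monitor appears as a link target containing `soctherm-oc-event` (the name that shows up in the listing posted later in this thread). `parse_hwmon_links` and `has_oc_monitor` are hypothetical helper names.

```python
import re

# Matches the tail of an `ls -l` line: "hwmonN -> target"
LINK_RE = re.compile(r"(hwmon\d+)\s+->\s+(\S+)\s*$")


def parse_hwmon_links(ls_output):
    """Map each hwmon entry name to its symlink target."""
    links = {}
    for line in ls_output.splitlines():
        match = LINK_RE.search(line)
        if match:
            links[match.group(1)] = match.group(2)
    return links


def has_oc_monitor(links, marker="soctherm-oc-event"):
    """True if any hwmon target path contains the marker string."""
    return any(marker in target for target in links.values())


# The listing from the post above, pasted verbatim.
listing = """\
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon0 -> ../../devices/virtual/thermal/thermal_zone5/hwmon0
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon1 -> ../../devices/platform/39c0000.tachometer/hwmon/hwmon1
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon2 -> ../../devices/platform/pwm-fan/hwmon/hwmon2
lrwxrwxrwx  1 root root 0 Mar 27  2023 hwmon3 -> ../../devices/platform/c250000.i2c/i2c-7/7-0040/hwmon/hwmon3
"""

links = parse_hwmon_links(listing)
print(sorted(links))
print("OC monitor present:", has_oc_monitor(links))
```

Matching on the device path rather than on the `hwmonN` index keeps the check valid when the numbering differs between kernels or boards.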

WayneWWW · January 20, 2025, 1:23pm

What is the CPU frequency when the system is idle and you have run sudo jetson_clocks?

lingsong.zheng · January 21, 2025, 3:18am

Because this issue is hard to reproduce, I ran the command on the module that previously showed the problem (the NX CPU frequency is currently normal). The output is as follows:

$ sudo jetson_clocks --show
SOC family:tegra194  Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-5
cpu0: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
GPU MinFreq=1109250000 MaxFreq=1109250000 CurrentFreq=1109250000
EMC MinFreq=204000000 MaxFreq=1866000000 CurrentFreq=1866000000 FreqOverride=1
DLA0_CORE:   Online=1 MinFreq=0 MaxFreq=1100800000 CurrentFreq=1100800000
DLA0_FALCON: Online=1 MinFreq=0 MaxFreq=640000000 CurrentFreq=640000000
DLA1_CORE:   Online=1 MinFreq=0 MaxFreq=1100800000 CurrentFreq=1100800000
DLA1_FALCON: Online=1 MinFreq=0 MaxFreq=640000000 CurrentFreq=640000000
PVA0_VPS0: Online=1 MinFreq=0 MaxFreq=819200000 CurrentFreq=819200000
PVA0_VPS1: Online=1 MinFreq=0 MaxFreq=819200000 CurrentFreq=819200000
PVA0_AXI:  Online=1 MinFreq=0 MaxFreq=601600000 CurrentFreq=601600000
PVA1_VPS0: Online=1 MinFreq=0 MaxFreq=819200000 CurrentFreq=819200000
PVA1_VPS1: Online=1 MinFreq=0 MaxFreq=819200000 CurrentFreq=819200000
PVA1_AXI:  Online=1 MinFreq=0 MaxFreq=601600000 CurrentFreq=601600000
CVNAS MinFreq=0 MaxFreq=576000000 CurrentFreq=576000000
FAN Dynamic Speed control=active hwmon2_pwm1=0
NV Power Mode: MODE_20W_6CORE
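
When this has to be checked across many units, the per-CPU lines can be parsed instead of read by eye. Below is a minimal Python sketch, assuming the line layout shown in the output above and that the CPU values are in kHz as in the Linux cpufreq sysfs files. `parse_cpu_freqs` and `below_max` are hypothetical helper names.

```python
import re

CPU_RE = re.compile(
    r"^(cpu\d+): .*?MinFreq=(\d+) MaxFreq=(\d+) CurrentFreq=(\d+)"
)


def parse_cpu_freqs(show_output):
    """Return {cpu: (min_khz, max_khz, cur_khz)} from `jetson_clocks --show` text."""
    freqs = {}
    for line in show_output.splitlines():
        match = CPU_RE.match(line.strip())
        if match:
            name, lo, hi, cur = match.groups()
            freqs[name] = (int(lo), int(hi), int(cur))
    return freqs


def below_max(freqs):
    """List CPUs whose current frequency is under their reported maximum."""
    return [cpu for cpu, (_, hi, cur) in sorted(freqs.items()) if cur < hi]


# Two lines copied from the output above; the GPU line is ignored by the parser.
sample = (
    "cpu0: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 "
    "CurrentFreq=1420800 IdleStates: C1=0 c6=0\n"
    "GPU MinFreq=1109250000 MaxFreq=1109250000 CurrentFreq=1109250000\n"
)
print(below_max(parse_cpu_freqs(sample)))  # prints []
```

The comparison uses only the fields that `jetson_clocks --show` prints. It cannot tell why a CPU is below its maximum, only that it is.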

WayneWWW · January 21, 2025, 3:25am

Was the unit running in MAXN mode when the problem occurred? It sounds like throttling happened.

lingsong.zheng · January 21, 2025, 5:43am

Yes, I didn’t change the power mode. I want to know if this is a quality issue, a software issue, or a hardware power supply issue, and how to troubleshoot it.

WayneWWW · January 21, 2025, 5:51am

Hi,

It is not an issue. If you were running in MAXN mode, this situation is probably due to over-current. The system throttles its frequency to protect itself.

If this is hard to follow, I can explain it once in Chinese. It seems that some of my earlier replies were not really understood.

lingsong.zheng · January 21, 2025, 7:43am

Is MODE_20W_6CORE the MAXN mode, and does this mode have some chance of triggering over-current protection?

WayneWWW · January 21, 2025, 8:00am

Hi,

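I looked at your dmesg. Both the kernel and the dts have been modified. Was this issue reproduced on a custom board?
Have you tried to reproduce it on the NV devkit?
Your earlier /sys/class/hwmon/ listing appears to be missing some nodes. The state looks a bit odd.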

lingsong.zheng · January 22, 2025, 6:48am

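Yes, the problem occurred on our own product. I checked the configuration, and the cause is probably that the following two options were not enabled. Would leaving them disabled cause the CPU frequency to drop?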

+CONFIG_TEGRA23X_OC_EVENT=y
+CONFIG_TEGRA19X_OC_EVENT=y

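This problem also occurs only very rarely on our product; most units have not shown anything similar. It would probably be hard to reproduce on the NVIDIA EVB board as well.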
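
Whether the two options are set in a given kernel can be verified from its config text. Below is a minimal Python sketch. It assumes the running kernel exposes `/proc/config.gz`, which exists only when the kernel was built with CONFIG_IKCONFIG_PROC; otherwise pass in the text of the build's `.config`. The helper names are hypothetical, and the `sample` string is made up for illustration rather than taken from the poster's kernel. The leading `+` in the lines above is a diff marker, not part of the option name.

```python
import gzip


def load_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (requires CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as f:
        return f.read()


def config_state(config_text, options):
    """Return {option: value}; value is None when the option is unset or absent."""
    state = {opt: None for opt in options}
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            # Covers "# CONFIG_FOO is not set" and blank lines.
            continue
        key, _, value = line.partition("=")
        if key in state:
            state[key] = value
    return state


OC_OPTIONS = ["CONFIG_TEGRA19X_OC_EVENT", "CONFIG_TEGRA23X_OC_EVENT"]

# Made-up sample text, only to show the two possible line forms.
sample = """\
CONFIG_TEGRA19X_OC_EVENT=y
# CONFIG_TEGRA23X_OC_EVENT is not set
"""
print(config_state(sample, OC_OPTIONS))
```

On a target, `config_state(load_running_config(), OC_OPTIONS)` would report the state of the kernel that is actually booted, which may differ from the source tree being edited.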

WayneWWW · January 22, 2025, 6:52am

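Please enable them first. This is not something you should turn off yourself. We cannot be sure how the machine behaves after a change like that…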

lingsong.zheng · January 22, 2025, 6:57am

OK. Because of a functional requirement, we made one USB-related change in the device tree:

+       xusb_padctl@3520000 {
+               ports {
+                       usb2-0 {
+                               mode = "host";
+                               status = "okay";
+                       };
+               };
+       };

After this change, /sys/class/hwmon/hwmon5 disappears while the other directories are still there. Is there any risk in this? Thank you.

WayneWWW · January 22, 2025, 7:44am

Which path did your "hwmon5" originally link to?

lingsong.zheng · February 5, 2025, 1:08am

Before the change:

~$ ls -l /sys/class/hwmon/
total 0
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon0 -> ../../devices/virtual/thermal/thermal_zone5/hwmon0
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon1 -> ../../devices/platform/d280000.soctherm-oc-event/hwmon/hwmon1
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon2 -> ../../devices/platform/39c0000.tachometer/hwmon/hwmon2
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon3 -> ../../devices/platform/pwm-fan/hwmon/hwmon3
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon4 -> ../../devices/platform/3520000.xusb_padctl/usb2-0/3520000.xusb_padctl:ports:usb2-0:connector/power_supply/usb-charger/hwmon4
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon5 -> ../../devices/platform/c250000.i2c/i2c-7/7-0040/hwmon/hwmon5

After the change:

~$ ls -l /sys/class/hwmon/
total 0
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon0 -> ../../devices/virtual/thermal/thermal_zone5/hwmon0
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon1 -> ../../devices/platform/d280000.soctherm-oc-event/hwmon/hwmon1
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon2 -> ../../devices/platform/39c0000.tachometer/hwmon/hwmon2
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon3 -> ../../devices/platform/pwm-fan/hwmon/hwmon3
lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon4 -> ../../devices/platform/c250000.i2c/i2c-7/7-0040/hwmon/hwmon4

WayneWWW · February 5, 2025, 3:04am

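hwmon5 is not what disappeared.

The entry below is the one that went away, so the device that used to be hwmon5 is now simply enumerated as hwmon4. The original functionality is not affected.

lrwxrwxrwx 1 root root 0 Sep  8 05:58 hwmon4 -> ../../devices/platform/3520000.xusb_padctl/usb2-0/3520000.xusb_padctl:ports:usb2-0:connector/power_supply/usb-charger/hwmon4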
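
The renumbering can be shown by comparing the two listings by device path instead of by index. Below is a minimal Python sketch using the link targets from the listings posted above. `device_of` and `diff_hwmon` are hypothetical helper names, and the sketch assumes every target ends in a `hwmonN` component, as all entries in this thread do.

```python
import re


def device_of(target):
    """Strip the trailing hwmonN component so devices compare across renumbering."""
    return re.sub(r"/hwmon\d+$", "", target)


def diff_hwmon(before, after):
    """Compare two {hwmon_name: link_target} maps by device, not by index.

    Returns (removed_devices, renumbered), where renumbered maps
    device -> (old_name, new_name) for devices whose index changed.
    """
    old = {device_of(t): name for name, t in before.items()}
    new = {device_of(t): name for name, t in after.items()}
    removed = sorted(set(old) - set(new))
    renumbered = {
        dev: (old[dev], new[dev])
        for dev in old
        if dev in new and old[dev] != new[dev]
    }
    return removed, renumbered


USB = ("../../devices/platform/3520000.xusb_padctl/usb2-0/"
       "3520000.xusb_padctl:ports:usb2-0:connector/power_supply/usb-charger")
I2C = "../../devices/platform/c250000.i2c/i2c-7/7-0040/hwmon"

before = {"hwmon4": USB + "/hwmon4", "hwmon5": I2C + "/hwmon5"}
after = {"hwmon4": I2C + "/hwmon4"}

removed, renumbered = diff_hwmon(before, after)
print("removed:", removed)
print("renumbered:", renumbered)
```

Here `removed` holds only the usb-charger device, and the 7-0040 I2C device is reported as moving from hwmon5 to hwmon4. Scripts that read sensors are safer resolving the device path than hard-coding an index that can shift like this.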

lingsong.zheng · February 5, 2025, 3:09am

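OK. Where is the frequency reduction actually carried out? Is there a user-space service that lowers the frequency on its own because it cannot detect the OC status?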

WayneWWW · February 5, 2025, 3:10am

Please turn the OC event back on first; only then can we discuss this…
