Thor Devkit: thermal_zone nodes malfunction and the temperature cannot be read

Background: The device boots and operates normally. After it was left idle overnight for more than 12 hours, the fan kept running at high speed and the following symptoms were observed.

1. Reading the temperature node file times out.

cat: /sys/devices/virtual/thermal/thermal_zone0/temp: Timer expired

2. The tegrastats command does not return.

3. The nvpmodel command fails and does not return normally.

$ nvpmodel -q

NV Power Mode: 120W

1

NVPM ERROR: Error reading /sys/kernel/nvpmodel_clk_cap/emc: 62

NVPM ERROR: failed to read PARAM EMC: ARG MAX_FREQ: PATH /sys/kernel/nvpmodel_clk_cap/emc

NVPM ERROR: Error reading /sys/devices/platform/bus@0/8181200000.host1x/818c000000.pva0/clk_cap/pva0_vps: 62

NVPM ERROR: failed to read PARAM PVA0_VPS: ARG MAX_FREQ: PATH /sys/devices/platform/bus@0/8181200000.host1x/818c000000.pva0/clk_cap/pva0_vps

4. Kernel error log

1970-01-02T00:43:47.817060+08:00 tegra-ubuntu-devkit kernel: thermal thermal_zone0: failed to read out thermal zone (-62)

1970-01-02T00:43:47.817090+08:00 tegra-ubuntu-devkit kernel: tegra-bpmp-i2c bpmp:i2c: failed to transfer message: -62

1970-01-02T00:43:47.817094+08:00 tegra-ubuntu-devkit kernel: tegra-bpmp-i2c bpmp:i2c: failed to transfer message: -62

1970-01-02T00:43:47.817105+08:00 tegra-ubuntu-devkit kernel: thermal thermal_zone2: failed to read out thermal zone (-62)

1970-01-02T00:43:47.817109+08:00 tegra-ubuntu-devkit kernel: thermal thermal_zone1: failed to read out thermal zone (-62)

1970-01-02T00:43:47.818967+08:00 tegra-ubuntu-devkit kernel: thermal thermal_zone3: failed to read out thermal zone (-62)

1970-01-02T00:43:47.818976+08:00 tegra-ubuntu-devkit kernel: nvvrs_pseq 5-003c: Failed to read interrupt register: 16, ret=-62

5. The Xorg process misbehaves: it uses 100% CPU and the display goes black.
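To check every thermal zone at once instead of running cat on each node, something like the sketch below could be used. It is only an illustration, not an NVIDIA tool: the function name read_thermal_zones is made up here, the default path is taken from the cat output above, and it assumes each temp node reports millidegrees Celsius as is usual for Linux thermal sysfs nodes. It records the errno for each zone that fails, so a read that fails with "Timer expired" shows up as an error entry rather than aborting the scan.

```python
import errno
import glob
import os


def read_thermal_zones(base="/sys/devices/virtual/thermal"):
    """Return {zone_name: temperature in deg C, or an error string}.

    `base` defaults to the directory seen in the report above; pass a
    different directory if the zones live elsewhere (assumption).
    """
    results = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        name = os.path.basename(zone)
        try:
            with open(os.path.join(zone, "temp")) as f:
                # Assumed unit: millidegrees Celsius.
                results[name] = int(f.read().strip()) / 1000.0
        except OSError as exc:
            # Keep the numeric errno and its symbolic name for the report.
            code = errno.errorcode.get(exc.errno, "?")
            results[name] = f"error {exc.errno} ({code}): {exc.strerror}"
        except ValueError:
            results[name] = "error: unparsable value"
    return results


if __name__ == "__main__":
    for name, value in read_thermal_zones().items():
        print(name, value)
```

On Linux, errno 62 is ETIME ("Timer expired"), which matches both the cat message and the -62 return codes in the kernel log, so a zone in the failing state would be expected to appear here as an "error 62 (ETIME)" entry.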

Is this being tested on the NVIDIA devkit?

Yes, the issue occurs on the Thor devkit.

Please check whether this issue is reproducible after you flash the board, or whether it happened randomly.

This issue occurred only once and happened randomly.

Does it only happen in NV Power Mode: 120W, or can it be reproduced in other power modes?

This issue occurred only once, and the device was in 120W mode at the time.