I have found that sometimes, when the CPU/GPU temperature is over 80 °C, the fan PWM is still 0. The CPU/GPU temperature on my NX board then keeps rising until the device finally powers down.
My fan mode is quiet.
Why is the fan PWM still 0 when the CPU/GPU temperature is very high?
We can try cool mode. But the only difference between cool and quiet mode should be the trip point temperatures and the corresponding hysteresis, right?
Where is the code (driver or service) that handles the trip point temperatures and the corresponding hysteresis? Can we check some logs (if there are any, or if we can enable them), or add more logging to debug this issue?
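For reference, the trip points and hysteresis that the kernel is currently using can usually be read back from sysfs. This is only a sketch: dump_trips is a hypothetical helper name, and it assumes the board exposes the standard Linux thermal layout under /sys/class/thermal (trip_point_N_temp, trip_point_N_hyst, trip_point_N_type), which I have not confirmed for every L4T release.

```shell
#!/bin/sh
# Sketch: print every trip point of every thermal zone.
# dump_trips is a hypothetical helper, not an NVIDIA tool.
# Assumption: standard Linux thermal sysfs layout; values are in millidegrees C.
dump_trips() {
    thermal_root=${1:-/sys/class/thermal}
    for zone in "$thermal_root"/thermal_zone*; do
        [ -d "$zone" ] || continue
        echo "zone: $(cat "$zone/type" 2>/dev/null || echo unknown)"
        for temp_file in "$zone"/trip_point_*_temp; do
            [ -r "$temp_file" ] || continue
            base=${temp_file%_temp}          # strip the _temp suffix
            temp=$(cat "$temp_file" 2>/dev/null)
            hyst=$(cat "${base}_hyst" 2>/dev/null || echo n/a)
            kind=$(cat "${base}_type" 2>/dev/null || echo n/a)
            echo "  ${base##*/}: temp=$temp hyst=$hyst type=$kind"
        done
    done
}
```

Running dump_trips once in quiet mode and once in cool mode and comparing the two outputs should show whether the trip temperatures and hysteresis are really the only difference.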
We checked that /sys/devices/pwm-fan/target_pwm is 0 when the temperature is very high, and the fan was not turned on. When I ran the command echo 255 > /sys/devices/pwm-fan/target_pwm, the fan started to run.
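Since the problem only shows up sometimes, a small watcher that records the moment it happens may help. The sketch below flags the case where a thermal zone is hot while target_pwm is still 0. check_fan_once is a hypothetical helper name, the 80000 m°C default limit is my assumption based on the 80 °C figure above, and the thermal zone paths assume the standard /sys/class/thermal layout. It checks every zone, so it may need narrowing to the CPU and GPU zones.

```shell
#!/bin/sh
# Sketch: detect "zone is hot but fan PWM is 0".
# check_fan_once is a hypothetical helper, not an NVIDIA tool.
# Args: thermal root, PWM file, limit in millidegrees C (all optional).
# Returns 0 if nothing looks wrong, 1 if a hot zone has PWM 0,
# 2 if the PWM file cannot be read.
check_fan_once() {
    thermal_root=${1:-/sys/class/thermal}
    pwm_file=${2:-/sys/devices/pwm-fan/target_pwm}
    limit_mc=${3:-80000}

    pwm=$(cat "$pwm_file" 2>/dev/null) || return 2
    stuck=0
    for zone in "$thermal_root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        temp=$(cat "$zone/temp" 2>/dev/null) || continue
        [ -n "$temp" ] || continue
        kind=$(cat "$zone/type" 2>/dev/null || echo unknown)
        echo "$(date +%T) $kind temp=$temp target_pwm=$pwm"
        if [ "$temp" -ge "$limit_mc" ] && [ "$pwm" -eq 0 ]; then
            echo "WARNING: $kind is at $temp mC but target_pwm is 0"
            stuck=1
        fi
    done
    return "$stuck"
}
```

Calling it in a loop, for example while true; do check_fan_once >> /tmp/fan_watch.log; sleep 5; done, leaves a timestamped record to compare against dmesg when the fan fails to start (the log path is just an example).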
I saw the thermal driver code at kernel/nvidia/drivers/thermal. Are those drivers built into the kernel or built as modules? I only saw the userspace_alert.ko module. The other thermal drivers are built into the kernel, is that right?
I need to test and debug this issue. Can I reload or restart those thermal drivers to check whether that resolves the issue?
I think the settings are OK. Most of the time it works, but sometimes the temperature is high and the fan is not spinning. I assume the driver code checks the temperature and, when the temperature is over the threshold in the table, sets the fan PWM. Maybe the driver code has a bug that breaks this functionality.
I see the pwm_fan_set_cur_state function in “pwm_fan.c”; this function should be what sets the fan PWM, but I could not find where it is called. It is the .set_cur_state member of pwm_fan_cooling_ops, and I also could not find what calls that ops' set_cur_state.