Fan PWM is 0 when CPU/GPU temperature is over 80°C

I found that sometimes, when the CPU/GPU temperature is over 80°C, the fan PWM is still 0. The CPU/GPU temperature of my NX board then keeps rising until the device finally powers down.

My fan mode is quiet.

Why is the fan PWM still 0 when the CPU/GPU temperature is very high?

Which JetPack version are you using? Are you on a custom board or the devkit?
We may need to investigate whether we can reproduce the error.

See the information below:

  • NVIDIA Jetson Xavier NX (Developer Kit Version)
    • Jetpack 4.6 [L4T 32.6.1]
    • NV Power Mode: MODE_15W_6CORE - Type: 2
    • jetson_stats.service: active
  • Board info:
    • Type: Xavier NX (Developer Kit Version)
    • SOC Family: tegra194 - ID:25
    • Module: P3668 - Board: P3509-000
    • Code Name: jakku
    • CUDA GPU architecture (ARCH_BIN): 7.2
    • Serial Number: 1424920027702

Hi,
Could you try fan mode “cool” and check if the issue still exists? The modes are listed in the developer guide.

We can try cool mode. But the only difference between cool and quiet mode should be the trip point temperatures and the corresponding hysteresis, right?

Where is the code (driver or service) that handles the trip point temperatures and the corresponding hysteresis? Can we check some logs (if there are any, or if we can enable them), or add more logging to debug this issue?

Hi,
Please check
Read the PWM fan RPM in Jetson NX - #5 by JerryChang

So cur_pwm is sometimes zero when the temperature is over 80? Do you observe it on the developer kit?

We checked that /sys/devices/pwm-fan/target_pwm is 0 when the temperature is very high, and the fan was not turned on. When I ran the command echo 255 > /sys/devices/pwm-fan/target_pwm, the fan started to run.

I did not check the cur_pwm.

Hi,
target_pwm is for manual control. Please check this section

cur_pwm should show the PWM value currently applied. Please check whether it meets expectations above 80 degrees. If not, please check the driver code.
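To make that check repeatable, a small logger can record cur_pwm and target_pwm next to the thermal zone temperatures. The following is only a sketch: the pwm-fan paths are the ones mentioned in this thread, the thermal zone layout is the generic Linux /sys/class/thermal one, and the helper names (read_int, snapshot, fan_stalled, watch) as well as zone names such as CPU-therm are assumptions for illustration, not something confirmed for every L4T release.

```python
import glob
import os
import time


def read_int(path):
    """Return the integer stored in a sysfs-style file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def snapshot(fan_dir="/sys/devices/pwm-fan", thermal_root="/sys/class/thermal"):
    """Collect cur_pwm, target_pwm and every readable zone temperature in Celsius."""
    temps = {}
    for zone in sorted(glob.glob(os.path.join(thermal_root, "thermal_zone*"))):
        milli_c = read_int(os.path.join(zone, "temp"))
        if milli_c is None:
            continue
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
        except OSError:
            name = os.path.basename(zone)  # fall back to the directory name
        temps[name] = milli_c / 1000.0  # sysfs reports millidegrees Celsius
    return {
        "cur_pwm": read_int(os.path.join(fan_dir, "cur_pwm")),
        "target_pwm": read_int(os.path.join(fan_dir, "target_pwm")),
        "temps_c": temps,
    }


def fan_stalled(snap, limit_c=80.0, zones=None):
    """True when a (selected) zone is above limit_c while cur_pwm reads 0."""
    temps = snap["temps_c"]
    if zones is not None:
        temps = {n: t for n, t in temps.items() if n in zones}
    hot = any(t > limit_c for t in temps.values())
    return hot and snap["cur_pwm"] == 0


def watch(samples=5, interval_s=2.0):
    """Print a few timestamped snapshots and flag the suspicious ones."""
    for _ in range(samples):
        snap = snapshot()
        note = "  <-- cur_pwm is 0 while hot" if fan_stalled(snap) else ""
        print(time.strftime("%H:%M:%S"), snap, note)
        time.sleep(interval_s)
```

Calling watch() on the device prints a handful of samples. Passing zones=("CPU-therm", "GPU-therm") to fan_stalled limits the check to those sensors, assuming the board exposes zones under those names.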

Though I did not check cur_pwm, I could see that the fan was not rotating at all when the temperature was over 80.

If I want to set the fan’s PWM, what is the best way to do it?

Hi,
You may manually control the fan by setting target_pwm
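As a scripted equivalent of the echo 255 > /sys/devices/pwm-fan/target_pwm command used earlier in this thread, something like the sketch below could work. The function name set_target_pwm is hypothetical, the 0..255 range is an assumption taken from the values seen in this thread, and writing to the real sysfs node normally requires root.

```python
def set_target_pwm(value, path="/sys/devices/pwm-fan/target_pwm"):
    """Write a PWM value, clamped to 0..255, and return what was written."""
    pwm = max(0, min(255, int(value)))  # assumed valid range: 0 (off) to 255 (full)
    with open(path, "w") as f:
        f.write(f"{pwm}\n")
    return pwm
```

For example, set_target_pwm(255) requests full speed and set_target_pwm(0) requests the fan off. The path parameter is there so the function can be pointed at a scratch file when trying it out away from the device.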

I saw the thermal driver code at kernel/nvidia/drivers/thermal. Are those drivers built into the kernel or built as modules? I only saw the userspace_alert.ko module. The other thermal drivers are built into the kernel, is that right?

I need to test and debug this issue. Can I reload/restart those thermal drivers to check whether that resolves it?

Hi,
The driver is built into the kernel and initialized during boot. You may add some prints to the driver code and check if the behavior is as expected.

For Xavier NX, the setting is in tegra194-p3509-0000-a00.dtsi:

	pwm-fan {
		pwms = <&tegra_pwm6 0 45334>;
		vdd-fan-supply = <&p3509_vdd_fan>;
		profiles {
			default = "quiet";
			quiet {
				state_cap = <4>;
				active_pwm = <0 130 160 200 255 255 255 255 255 255>;
			};
			cool {
				state_cap = <4>;
				active_pwm = <0 140 170 200 255 255 255 255 255 255>;
			};
		};
	};
	thermal-fan-est {
		profiles {
			default = "quiet";
			quiet {
				active_trip_temps = <0 46000 60000 68000 76000
						140000 150000 160000 170000 180000>;
				active_hysteresis = <0 8000 8000 7000 7000
						0 0 0 0 0>;
			};
			cool {
				active_trip_temps = <0 35000 45000 53000 61000
						140000 150000 160000 170000 180000>;
				active_hysteresis = <0 8000 8000 7000 7000
						0 0 0 0 0>;
			};
		};
	};

You may also adjust the settings as a test. The settings should work for the Xavier NX developer kit, but adjusting them for debugging should be fine.
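To see what PWM the quiet profile above would be expected to give at a given temperature, the tables can be walked by hand. The sketch below is a simplified model, not the driver's actual algorithm: it assumes the trip values are millidegrees Celsius, that the cooling state steps up when the next trip temperature is reached, and that it steps down once the temperature falls below the current trip minus its hysteresis. The names QUIET, next_state and expected_pwm are made up for illustration.

```python
# Values copied from the "quiet" profile in the device tree excerpt above.
QUIET = {
    "state_cap": 4,
    "active_pwm": [0, 130, 160, 200, 255, 255, 255, 255, 255, 255],
    "trip_temps": [0, 46000, 60000, 68000, 76000,
                   140000, 150000, 160000, 170000, 180000],
    "hysteresis": [0, 8000, 8000, 7000, 7000, 0, 0, 0, 0, 0],
}


def next_state(profile, state, temp_mc):
    """Return the cooling state after seeing temp_mc (millidegrees, assumed)."""
    trips = profile["trip_temps"]
    hyst = profile["hysteresis"]
    cap = profile["state_cap"]
    # Step up while the next trip point has been reached (never past state_cap).
    while state < cap and temp_mc >= trips[state + 1]:
        state += 1
    # Step down while the temperature is below the current trip minus hysteresis.
    while state > 0 and temp_mc < trips[state] - hyst[state]:
        state -= 1
    return state


def expected_pwm(profile, state):
    """Look up the PWM value for a cooling state, limited by state_cap."""
    return profile["active_pwm"][min(state, profile["state_cap"])]
```

Under these assumptions, next_state(QUIET, 0, 80000) gives state 4 and expected_pwm(QUIET, 4) gives 255, so a reading of 0 at 80 degrees would not match the table. The real driver may compute its temperature estimate and apply hysteresis differently, so treat this only as a reference point when comparing against cur_pwm.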

I think the settings are OK. Most of the time, it works. But sometimes the temperature is high and the fan is not spinning. I suppose the driver code checks the temperature and, when it exceeds a threshold in the table, sets the fan PWM. Maybe the driver code has a bug that breaks this functionality.

I want to find the code that checks the temperature and then sets the fan PWM. Do you know where that code is?

Hi,
The driver code is

kernel/nvidia/drivers/thermal/pwm_fan.c
kernel/nvidia/drivers/misc/therm_fan_est.c

Thanks, @DaneLLL

I see the pwm_fan_set_cur_state function in “pwm_fan.c”; this function should be what sets the fan PWM. But I did not find where it is called. It is the .set_cur_state member of pwm_fan_cooling_ops, and I also did not find what calls this ops’s set_cur_state.

Hi,
The common driver is

kernel/kernel-4.9/drivers/thermal/thermal_core.c

And it calls the function cdev->ops->set_cur_state().
