How does tegra_fan/target_pwm integrate with thermal management?

What thermal management software runs by default on the TX2?

I notice that I can control the speed of the fan by writing a value between 0 and 255 into ‘/sys/kernel/debug/tegra_fan/target_pwm’.

I can also turn on more cores and raise clock speeds by using ‘nvpmodel -m 0’.

When running more cores, the temperature goes up to about 45 degrees, even when the TX2 is idle.

I measure this by running cat on every file matching ‘/sys/devices/virtual/thermal/thermal_zone*/temp’.
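For anyone who wants the same readings in a script, here is a minimal sketch. It assumes each zone directory exposes a `type` and a `temp` file and that `temp` is reported in millidegrees Celsius, as in the generic Linux thermal sysfs interface; the function name `read_zone_temps` is made up for illustration.

```python
from pathlib import Path


def read_zone_temps(base="/sys/devices/virtual/thermal"):
    """Return {zone_name: (zone_type, temp_c)} for every thermal_zone* under base.

    Assumption: `temp` holds an integer in millidegrees Celsius.
    """
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            raw = (zone / "temp").read_text().strip()
            kind = (zone / "type").read_text().strip()
            temps[zone.name] = (kind, int(raw) / 1000.0)
        except (OSError, ValueError):
            # Skip zones that cannot be read or do not return a number.
            continue
    return temps
```

Calling `read_zone_temps()` on the board and printing the items gives one line per zone; passing a different `base` lets you try it against a copy of the directory.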

Is there any software that will look at the temperature of the different zones, and regulate the fan speed based on temperature?
If so, where does this software physically live? Is it a kernel module? A systemd service? What’s the name of that software? Does it have configuration files?

Or is the fan control totally manual, and if I want load-based fan speed, do I have to write my own software that polls the important thermal zones and turns the fan on and off as appropriate?
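To make the manual option concrete, a polling loop of that kind could look roughly like the sketch below. It is only an illustration: the 40 and 70 degree thresholds, the linear ramp, and all function names are hypothetical, the temperatures are assumed to be in millidegrees Celsius, and it says nothing about whether the built-in thermal management would later overwrite the value written to target_pwm.

```python
import time
from pathlib import Path

# Path taken from this thread; writing to it normally requires root.
FAN_PWM = Path("/sys/kernel/debug/tegra_fan/target_pwm")
ZONES = Path("/sys/devices/virtual/thermal")

# Hypothetical thresholds, chosen only for illustration.
FAN_OFF_BELOW_C = 40.0
FAN_FULL_AT_C = 70.0


def pwm_for_temp(temp_c, low=FAN_OFF_BELOW_C, high=FAN_FULL_AT_C):
    """Map a temperature to a PWM value in 0..255 with a linear ramp."""
    if temp_c <= low:
        return 0
    if temp_c >= high:
        return 255
    return round(255 * (temp_c - low) / (high - low))


def hottest_zone_c(base=ZONES):
    """Return the highest zone temperature in degrees C, or None if none readable."""
    readings = []
    for temp_file in Path(base).glob("thermal_zone*/temp"):
        try:
            # Assumption: the file holds millidegrees Celsius.
            readings.append(int(temp_file.read_text()) / 1000.0)
        except (OSError, ValueError):
            continue
    return max(readings) if readings else None


def run(poll_seconds=2.0, fan_pwm=FAN_PWM, base=ZONES):
    """Poll forever and write the mapped PWM value to the fan node."""
    while True:
        temp_c = hottest_zone_c(base)
        if temp_c is not None:
            fan_pwm.write_text(str(pwm_for_temp(temp_c)))
        time.sleep(poll_seconds)
```

Calling `run()` as root would start the loop. Using the hottest zone is a conservative choice; a real controller would probably add hysteresis so the fan does not toggle around a threshold.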

I’ve read the thermal management document, but I don’t remember any talk about the software bit, only about thermal interface, test points, thermal resistance and so forth (all very important for the physical/hardware side of things!)

The fan is controlled automatically by the thermal management scheme. Thermal management monitors the thermal zones and, when needed, adjusts system performance to bring the temperature down and keep the system within safe limits. The zones correspond to physical sensors at dedicated locations on the chip.

Thanks for the answer, although it only answers half the question. The hardware bit of thermal is already well described in the thermal design guidelines document, including where the various hardware zone sensors are located. I’m asking about the behavior and configuration of the software part.

Where does this scheme live, and how does it use the parameters in /sys/devices/virtual/thermal/… ?
What happens when I change those various device parameters?

If I do “echo 0 > /sys/devices/virtual/thermal/…/target_pwm” will I prevent the thermal software from turning the fan on? Or will the thermal management at some point decide “the fan should be on” and update the value of target pwm?

When I use a third party carrier board, such as the Auvidea J90, the fan is by default constant-on, and there’s a GPIO that can turn it off. Is there some way to make this on/off controlled by the internal thermal management scheme? If so, how?

I’m sure you figured it out already, but I need this patch to make my J120 fan work correctly (the latest Auvidea dtbs have this included):

diff --git a/kernel-dts/t18x-common-platforms/tegra186-cvb-prod-p2597-b00-p3310-1000-a00-00.dtsi b/kernel-dts/t18x-common-platforms/tegra186-cvb-prod-p2597-b00-p3310-1000-a00-00.dtsi
index 972d701..c8ee2ce 100644
--- a/kernel-dts/t18x-common-platforms/tegra186-cvb-prod-p2597-b00-p3310-1000-a00-00.dtsi
+++ b/kernel-dts/t18x-common-platforms/tegra186-cvb-prod-p2597-b00-p3310-1000-a00-00.dtsi
@@ -248,7 +248,8 @@
                        regulator-name = "vdd-fan";
                        regulator-min-microvolt = <5000000>;
                        regulator-max-microvolt = <5000000>;
-                       gpio = <&gpio_i2c_0_74 4 0>;
+                       /*gpio = <&gpio_i2c_0_74 4 0>;*/
+                       gpio = <&tegra_main_gpio TEGRA_MAIN_GPIO(J, 6) 0>;
                vdd_3v3: regulator@14 {

diff --git a/kernel-dts/t18x-common-platforms/tegra186-quill-pwm-fan.dtsi b/kernel-dts/t18x-common-platforms/tegra186-quill-pwm-fan.dtsi
index 1e2f654..05eb691 100644
--- a/kernel-dts/t18x-common-platforms/tegra186-quill-pwm-fan.dtsi
+++ b/kernel-dts/t18x-common-platforms/tegra186-quill-pwm-fan.dtsi
@@ -29,7 +29,8 @@
                state_cap = <7>;
                active_pwm_max = <256>;
                tach_period = <1000>;
-               pwm_gpio = <&tegra_aon_gpio TEGRA_AON_GPIO(V, 6) GPIO_ACTIVE_LOW>; /* TEGRA_MAIN_GPIO_PV6 */
+               /*pwm_gpio = <&tegra_aon_gpio TEGRA_AON_GPIO(V, 6) GPIO_ACTIVE_LOW>; // TEGRA_MAIN_GPIO_PV6 */
+               pwm_gpio = <&tegra_aon_gpio TEGRA_AON_GPIO(J, 6) GPIO_ACTIVE_LOW>; /* TEGRA_MAIN_GPIO_PJ6 */
        pwm-fan {
                status = "okay";

The thermal management doc tells you how to modify your dts to change the trip points. If you want to experiment at run time, you can write to the trip point files under /sys/devices/virtual/thermal/thermal_zone7.
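A rough sketch of that run-time experiment follows. It assumes the zone directory uses the generic Linux thermal sysfs naming (`trip_point_N_temp` and `trip_point_N_type`, temperatures in millidegrees Celsius) and that the kernel allows writing those files; the helper names `read_trip_points` and `set_trip_point` are made up here.

```python
from pathlib import Path


def read_trip_points(zone_dir):
    """Return {index: (trip_type, temp_c)} for one thermal zone directory."""
    zone = Path(zone_dir)
    trips = {}
    for temp_file in zone.glob("trip_point_*_temp"):
        # File names look like trip_point_<N>_temp; take N.
        index = int(temp_file.name.split("_")[2])
        type_file = zone / f"trip_point_{index}_type"
        kind = type_file.read_text().strip() if type_file.exists() else "unknown"
        # Assumption: the value is in millidegrees Celsius.
        trips[index] = (kind, int(temp_file.read_text()) / 1000.0)
    return dict(sorted(trips.items()))


def set_trip_point(zone_dir, index, temp_c):
    """Write a new trip temperature (degrees C) for trip point `index`.

    Needs root, and only works if the kernel exposes the file as writable.
    """
    target = Path(zone_dir) / f"trip_point_{index}_temp"
    target.write_text(str(int(round(temp_c * 1000))))
```

For example, `read_trip_points("/sys/devices/virtual/thermal/thermal_zone7")` lists the current values before you change anything. Values written this way do not survive a reboot, so the dts change is still the way to make them permanent.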