TK1 Temperature issue

Hello!!

Thanks for viewing my topic. XD

I am using a Tegra TK1 with the r21.3 release.

Yesterday I ran into a problem.

I want to test the TK1 module at high temperature.

But it is forced to shut down with this message:
thermal thermal_zone0: critical temperature reached(105 C),shutting down

And when I run the sensors command, I get:

CPU-therm-virtual-0
Adapter: Virtual device
temp1: +92.5°C (crit = +105.0°C)

GPU-therm-virtual-0
Adapter: Virtual device
temp1: +92.0°C (crit = +101.0°C)

MEM-therm-virtual-0
Adapter: Virtual device
temp1: +92.0°C (crit = +101.0°C)

PLL-therm-virtual-0
Adapter: Virtual device
temp1: +92.5°C
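
For logging during a chamber run, the sensors output above can be turned into numbers. This is only a minimal Python sketch, assuming the output looks exactly like the text above; the names parse_sensors and headroom are made up for illustration.

```python
import re

# Matches lines such as "temp1: +92.5°C (crit = +105.0°C)"; the crit part is optional.
_TEMP_RE = re.compile(
    r"temp1:\s*\+?(-?\d+(?:\.\d+)?)\s*°C"
    r"(?:\s*\(crit\s*=\s*\+?(-?\d+(?:\.\d+)?)\s*°C\))?"
)


def parse_sensors(text):
    """Return {chip_name: (temp_c, crit_c_or_None)} parsed from `sensors` output."""
    readings = {}
    chip = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            chip = None  # a blank line ends the current chip block
            continue
        if chip is None:
            chip = line  # first non-blank line of a block is the chip name
            continue
        match = _TEMP_RE.match(line)
        if match:
            crit = float(match.group(2)) if match.group(2) else None
            readings[chip] = (float(match.group(1)), crit)
    return readings


def headroom(readings):
    """Degrees C left before each critical limit (chips without one are skipped)."""
    return {
        chip: crit - temp
        for chip, (temp, crit) in readings.items()
        if crit is not None
    }
```

On the board, the text could come from subprocess.run(["sensors"], capture_output=True, text=True).stdout. With the readings above, the CPU zone has 12.5 degrees of headroom and the GPU and MEM zones have 9.0.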

I found these related places in the source:

  1. /drivers/gpu/drm/nouveau/core/subdev/therm/temp.c
static void
nouveau_therm_temp_set_defaults(struct nouveau_therm *therm)
{
    struct nouveau_therm_priv *priv = (void *)therm;

    priv->bios_sensor.offset_constant = 0;

    priv->bios_sensor.thrs_fan_boost.temp = 90;
    priv->bios_sensor.thrs_fan_boost.hysteresis = 3;

    priv->bios_sensor.thrs_down_clock.temp = 95;
    priv->bios_sensor.thrs_down_clock.hysteresis = 3;

    priv->bios_sensor.thrs_critical.temp = 105;
    priv->bios_sensor.thrs_critical.hysteresis = 5;

    priv->bios_sensor.thrs_shutdown.temp = 135;
    priv->bios_sensor.thrs_shutdown.hysteresis = 5; /*not that it matters */
}
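
Each threshold in the function above pairs a trip temperature with a hysteresis. Here is a minimal sketch of how such a pair usually behaves, assuming the common convention that the state trips at temp and clears only once the reading has fallen to temp minus hysteresis. The Threshold class is hypothetical and is not the nouveau implementation.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    temp: float          # trip temperature in degrees C
    hysteresis: float    # how far below temp the reading must fall to clear
    active: bool = False

    def update(self, reading: float) -> bool:
        """Feed one temperature reading and return the new state."""
        if not self.active and reading >= self.temp:
            self.active = True
        elif self.active and reading <= self.temp - self.hysteresis:
            self.active = False
        return self.active


# Values taken from the thrs_down_clock defaults above.
down_clock = Threshold(temp=95, hysteresis=3)
states = [down_clock.update(t) for t in (94, 95, 93, 92, 91)]
```

The state turns on at 95, stays on at 93, and turns off only at 92, so a reading that hovers around 95 does not toggle it on every sample.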
  2. arch/arm/mach-tegra/tegra11_soctherm.c
static struct soctherm_fuse_correction_war t14x_fuse_war[] = {
    [TSENSE_CPU0] = { 1149000, -16753000 },
    [TSENSE_CPU1] = { 1148800, -16287000 },
    [TSENSE_CPU2] = { 1139100, -12552000 },
    [TSENSE_CPU3] = { 1141800, -11061000 },
    [TSENSE_MEM0] = { 1082300, -11061000 },
    [TSENSE_MEM1] = { 1061800,  -7596500 },
    [TSENSE_GPU]  = { 1078900, -10480000 },
    [TSENSE_PLLX] = { 1125900, -14736000 },
};

/* old CP/FT */
static struct soctherm_fuse_correction_war t12x_fuse_war1[] = {
    [TSENSE_CPU0] = { 1148300, -6572300 },
    [TSENSE_CPU1] = { 1126100, -5794600 },
    [TSENSE_CPU2] = { 1155800, -7462800 },
    [TSENSE_CPU3] = { 1134900, -6810800 },
    [TSENSE_MEM0] = { 1062700, -4463200 },
    [TSENSE_MEM1] = { 1084700, -5603400 },
    [TSENSE_GPU]  = { 1084300, -5111900 },
    [TSENSE_PLLX] = { 1134500, -7410700 },
};

/* new CP1/CP2 */
static struct soctherm_fuse_correction_war t12x_fuse_war2[] = {
    [TSENSE_CPU0] = { 1135400, -6266900 },
    [TSENSE_CPU1] = { 1122220, -5700700 },
    [TSENSE_CPU2] = { 1127000, -6768200 },
    [TSENSE_CPU3] = { 1110900, -6232000 },
    [TSENSE_MEM0] = { 1122300, -5936400 },
    [TSENSE_MEM1] = { 1145700, -7124600 },
    [TSENSE_GPU]  = { 1120100, -6000500 },
    [TSENSE_PLLX] = { 1106500, -6729300 },
};

/* old ATE pattern */
static struct soctherm_fuse_correction_war t13x_fuse_war1[] = {
    [TSENSE_CPU0] = { 1119800,  -6330400 },
    [TSENSE_CPU1] = { 1094100,  -3751800 },
    [TSENSE_CPU2] = { 1108800,  -3835200 },
    [TSENSE_CPU3] = { 1103200,  -5132100 },
    [TSENSE_MEM0] = { 1168400, -11266000 },
    [TSENSE_MEM1] = { 1185600, -10861000 },
    [TSENSE_GPU]  = { 1158500, -10714000 },
    [TSENSE_PLLX] = { 1150000, -11899000 },
};

/* new ATE pattern */
static struct soctherm_fuse_correction_war t13x_fuse_war2[] = {
    [TSENSE_CPU0] = { 1126600, -9433500 },
    [TSENSE_CPU1] = { 1110800, -7383000 },
    [TSENSE_CPU2] = { 1113800, -6215200 },
    [TSENSE_CPU3] = { 1129600, -8196100 },
    [TSENSE_MEM0] = { 1132900, -6755300 },
    [TSENSE_MEM1] = { 1142300, -7374200 },
    [TSENSE_GPU]  = { 1125100, -6350400 },
    [TSENSE_PLLX] = { 1118100, -8208800 },
};
  3. drivers/platform/x86/acerhdf.c
#define ACERHDF_TEMP_CRIT 89000       // 95000

[b]But only number 3 worked…

I want to know how to change the temperature limit values:
the CPU/GPU/MEM crit limits and the 105 C value in the “critical temperature reached(105 C)” message.
[/b]

These limit values should not be changed, as they are there for chip protection.
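
The limits can still be inspected without changing them. A read-only sketch, assuming the board exposes the standard Linux thermal sysfs files (trip_point_N_temp in millidegrees C and trip_point_N_type under each zone directory). I have not confirmed which of these r21.3 provides, and read_trip_points is a made-up helper name.

```python
from pathlib import Path


def read_trip_points(zone_dir):
    """Return [(trip_type, temp_c)] for one thermal zone directory, ordered by trip index."""
    zone = Path(zone_dir)
    temp_files = sorted(
        zone.glob("trip_point_*_temp"),
        key=lambda p: int(p.name.split("_")[2]),  # numeric order, so 10 sorts after 2
    )
    trips = []
    for temp_file in temp_files:
        type_file = zone / temp_file.name.replace("_temp", "_type")
        trip_type = type_file.read_text().strip() if type_file.exists() else "unknown"
        millideg = int(temp_file.read_text().strip())
        trips.append((trip_type, millideg / 1000.0))
    return trips
```

For example, read_trip_points("/sys/class/thermal/thermal_zone0") would list the trip points of the zone named in the shutdown message, if those files exist on the board.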

Thank you for the answer, but I want to test the TK1 module at high temperature.

I tested lowering the CPU clock by changing the value in the
“/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq” file.

When I changed the clock value from 2218500 to 1734000, the highest temperature at which it keeps working went from 86 C to 92 C.
(This is the test chamber temperature, not the TK1 sensor temperature.)
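
The clock cap described above can be scripted so each chamber run starts from a known setting. A minimal sketch, assuming the usual cpufreq sysfs layout where scaling_max_freq holds a value in kHz; get_max_freq and set_max_freq are hypothetical helper names, and writing the file on the board needs root.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def _max_freq_path(cpu, root):
    return Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_max_freq"


def get_max_freq(cpu=0, root=CPU_ROOT):
    """Return the current frequency cap of one CPU, in kHz."""
    return int(_max_freq_path(cpu, root).read_text().strip())


def set_max_freq(khz, cpu=0, root=CPU_ROOT):
    """Write a new frequency cap and return the value read back afterwards."""
    _max_freq_path(cpu, root).write_text(f"{khz}\n")
    # The kernel may adjust the request, so report what it kept.
    return get_max_freq(cpu, root)
```

Calling set_max_freq(1734000) would apply the lower of the two values from the test above. The root parameter is only there so the helpers can be tried against a scratch directory first.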

But if the CPU clock goes down, the whole processing speed goes down too, so that is not what I want…

You told me the limit values cannot be changed.

So is there some method to bring the temperature down while the TK1 module is working at high temperature?

The hardware is fixed, so I have to find a method in software.

So if the HW is fixed, which means the thermal dissipation can’t be optimized, the temperature can only be brought down by reducing performance, right?
The TK1 will automatically degrade performance to protect the chip when the temperature reaches the limits.
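
To see that automatic degradation during a chamber run, zone temperatures can be sampled next to the current CPU clock. A rough sketch, assuming the standard thermal_zone*/type and thermal_zone*/temp files (millidegrees C) and cpufreq's scaling_cur_freq (kHz) are present; snapshot is a made-up name.

```python
from pathlib import Path


def snapshot(thermal_root="/sys/class/thermal", cpu_root="/sys/devices/system/cpu"):
    """Return zone temperatures in degrees C and the current cpu0 clock in kHz."""
    zones = {}
    for zone in sorted(Path(thermal_root).glob("thermal_zone*")):
        name = (zone / "type").read_text().strip()
        zones[name] = int((zone / "temp").read_text().strip()) / 1000.0
    cur_freq = Path(cpu_root) / "cpu0" / "cpufreq" / "scaling_cur_freq"
    khz = int(cur_freq.read_text().strip()) if cur_freq.exists() else None
    return {"zones_c": zones, "cpu0_khz": khz}
```

Calling snapshot() once per second and logging the result should show the clock dropping as the zones approach their limits.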

[b]
Is there any other option to bring the temperature down without reducing performance?

If it is not possible to bring the temperature down in software, I will change the HW design and restart the project.

[/b]

There is no such SW way. I suggest enhancing the HW thermal design.

Thx !!