Hello,
we are running a Jetson Xavier under heavy GPU load and moderate CPU load.
After a few hours, it suddenly switches off.
Below is the tegrastats output up to the moment it shut down (interval 5 seconds):
RAM 11306/15700MB (lfb 9x1MB) CPU [98%@2265,26%@2265,36%@2265,30%@2265,27%@2265,26%@2265,25%@2265,34%@2265] EMC_FREQ 62%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@69.5C thermal@67.5C PMIC@100C GPU 19807/17759 CPU 6106/4963 SOC 5510/5251 CV 0/0 VDDRQ 2977/2877 SYS5V 4654/4576
RAM 11306/15700MB (lfb 8x1MB) CPU [98%@2188,34%@2188,32%@2220,27%@2020,29%@2062,30%@2265,28%@2265,25%@2265] EMC_FREQ 59%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 1% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@69.5C thermal@67.2C PMIC@100C GPU 18341/17760 CPU 5813/4964 SOC 5216/5251 CV 0/0 VDDRQ 2830/2877 SYS5V 4574/4576
RAM 11307/15700MB (lfb 7x1MB) CPU [97%@2265,37%@2265,27%@2265,30%@2265,28%@2221,29%@2188,27%@2188,26%@2145] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 0% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@69C thermal@67.35C PMIC@100C GPU 19219/17760 CPU 5810/4964 SOC 5363/5251 CV 0/0 VDDRQ 2979/2877 SYS5V 4614/4576
RAM 11307/15700MB (lfb 7x1MB) CPU [97%@2265,30%@2265,31%@2265,32%@2265,26%@2265,26%@2265,31%@2265,28%@2265] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@70C thermal@67.5C PMIC@100C GPU 19219/17761 CPU 6108/4964 SOC 5361/5251 CV 0/0 VDDRQ 2978/2877 SYS5V 4654/4576
RAM 11308/15700MB (lfb 7x1MB) CPU [97%@2265,29%@2265,30%@2265,33%@2265,26%@2265,30%@2265,27%@2265,26%@2265] EMC_FREQ 58%@2133 GR3D_FREQ 90%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.5C AUX@63C CPU@69.5C thermal@67.35C PMIC@100C GPU 18043/17761 CPU 5813/4965 SOC 5216/5251 CV 0/0 VDDRQ 2832/2877 SYS5V 4574/4576
RAM 11307/15700MB (lfb 7x1MB) CPU [97%@1410,29%@1881,30%@2035,34%@2244,29%@2188,26%@2265,26%@2265,30%@2265] EMC_FREQ 62%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 1% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@69C thermal@67.65C PMIC@100C GPU 19956/17761 CPU 5810/4965 SOC 5510/5251 CV 0/0 VDDRQ 2977/2877 SYS5V 4654/4576
RAM 11307/15700MB (lfb 7x1MB) CPU [96%@2265,31%@2265,31%@2265,28%@2265,29%@2265,31%@2265,27%@2026,27%@2079] EMC_FREQ 62%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@69.5C thermal@67.65C PMIC@100C GPU 19517/17762 CPU 5957/4965 SOC 5510/5251 CV 0/0 VDDRQ 2978/2877 SYS5V 4654/4576
RAM 11307/15700MB (lfb 7x1MB) CPU [97%@2265,31%@2265,32%@2265,29%@2265,28%@2265,29%@2265,28%@2265,28%@2265] EMC_FREQ 58%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 1% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@69C thermal@67.35C PMIC@100C GPU 17894/17762 CPU 5664/4966 SOC 5216/5251 CV 0/0 VDDRQ 2832/2877 SYS5V 4574/4576
RAM 11304/15700MB (lfb 8x1MB) CPU [97%@2265,34%@2265,31%@2265,29%@2265,26%@2265,27%@2265,28%@2265,30%@2254] EMC_FREQ 60%@2133 GR3D_FREQ 87%@1377 APE 150 MTS fg 3% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@70C thermal@67.65C PMIC@100C GPU 18921/17763 CPU 5810/4966 SOC 5363/5251 CV 0/0 VDDRQ 2978/2877 SYS5V 4654/4577
RAM 11305/15700MB (lfb 8x1MB) CPU [98%@2265,31%@2265,33%@2265,29%@2265,33%@2265,27%@2265,26%@2265,27%@2265] EMC_FREQ 61%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@70C thermal@67.65C PMIC@100C GPU 18632/17763 CPU 5959/4966 SOC 5363/5251 CV 0/0 VDDRQ 2830/2877 SYS5V 4574/4576
RAM 11305/15700MB (lfb 8x1MB) CPU [98%@2265,35%@2265,34%@2265,28%@2265,28%@2265,26%@2265,28%@2265,27%@2265] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@71.5C Tboard@62C Tdiode@65.75C AUX@63C CPU@69.5C thermal@67.5C PMIC@100C GPU 18333/17763 CPU 5813/4966 SOC 5216/5251 CV 0/0 VDDRQ 2830/2877 SYS5V 4574/4576
RAM 11306/15700MB (lfb 8x1MB) CPU [98%@2265,30%@2265,36%@2265,28%@2265,27%@1981,28%@1876,27%@2265,30%@2188] EMC_FREQ 62%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@70C thermal@68C PMIC@100C GPU 19807/17764 CPU 6106/4967 SOC 5510/5251 CV 0/0 VDDRQ 2977/2877 SYS5V 4654/4577
RAM 11305/15700MB (lfb 8x1MB) CPU [98%@2265,30%@2265,31%@2265,27%@2265,31%@2265,29%@2265,27%@2265,29%@2265] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@65.75C AUX@63C CPU@69.5C thermal@67.5C PMIC@100C GPU 18930/17764 CPU 5661/4967 SOC 5214/5251 CV 0/0 VDDRQ 2830/2877 SYS5V 4574/4577
RAM 11307/15700MB (lfb 8x1MB) CPU [99%@2116,31%@1420,34%@2265,31%@2265,26%@2265,27%@2265,31%@2265,30%@2265] EMC_FREQ 61%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66.5C GPU@72.5C Tboard@62C Tdiode@66C AUX@63C CPU@69C thermal@67.95C PMIC@100C GPU 19219/17765 CPU 5810/4967 SOC 5363/5251 CV 0/0 VDDRQ 2978/2877 SYS5V 4614/4577
RAM 11308/15700MB (lfb 8x1MB) CPU [99%@2265,27%@2265,38%@2259,34%@2260,27%@2265,30%@2265,27%@2265,26%@2265] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 1% bg 0% AO@66C GPU@72.5C Tboard@62C Tdiode@66C AUX@63.5C CPU@70C thermal@67.85C PMIC@100C GPU 19368/17765 CPU 6106/4968 SOC 5361/5251 CV 0/0 VDDRQ 2977/2877 SYS5V 4654/4577
RAM 11308/15700MB (lfb 8x1MB) CPU [99%@2213,34%@2244,34%@2265,33%@2265,26%@2153,26%@1728,27%@2040,26%@2166] EMC_FREQ 60%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66.5C GPU@72.5C Tboard@62C Tdiode@66C AUX@63.5C CPU@69.5C thermal@67.85C PMIC@100C GPU 19368/17766 CPU 5808/4968 SOC 5361/5251 CV 0/0 VDDRQ 2978/2877 SYS5V 4614/4577
RAM 11308/15700MB (lfb 8x1MB) CPU [99%@2265,29%@2265,33%@2265,35%@2265,28%@2265,24%@2265,26%@2265,28%@2265] EMC_FREQ 61%@2133 GR3D_FREQ 99%@1377 APE 150 MTS fg 2% bg 0% AO@66C GPU@72C Tboard@62C Tdiode@66C AUX@63.5C CPU@70C thermal@68C PMIC@100C GPU 19658/17766 CPU 6106/4968 SOC 5510/5251 CV 0/0 VDDRQ 2977/2878 SYS5V 4654/4577
Is it being switched off because one of the sensors reports too high a temperature?
I checked the thermal design guide and the temperatures seem to be fine, but I could not find a maximum value for Tdiode.
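In case it helps anyone compare the readings against the guide, here is a minimal Python sketch that pulls the per-sensor temperatures out of log lines like the ones above and reports the peak for each sensor. It assumes the `NAME@valueC` format seen in this output; the names `TEMP_RE`, `parse_temps` and `max_temps` are only illustrative, and no limits are hard-coded because those have to come from the thermal design guide.

```python
import re

# Matches sensor readings such as "GPU@71.5C" or "PMIC@100C".
# Load/frequency fields like "98%@2265" are skipped because "%" is not a word character.
TEMP_RE = re.compile(r"(\w+)@(\d+(?:\.\d+)?)C\b")


def parse_temps(line):
    """Return {sensor_name: degrees_C} for one tegrastats line."""
    return {name: float(value) for name, value in TEMP_RE.findall(line)}


def max_temps(lines):
    """Return the highest reading seen per sensor across all lines."""
    peaks = {}
    for line in lines:
        for name, value in parse_temps(line).items():
            peaks[name] = max(value, peaks.get(name, value))
    return peaks


if __name__ == "__main__":
    import sys

    # Usage (file name is hypothetical): python peaks.py < tegrastats.log
    for sensor, peak in sorted(max_temps(sys.stdin).items()):
        print(f"{sensor}: {peak}C")
```

This only summarizes what the sensors reported. It does not say whether a given sensor's value is meaningful or what its shutdown threshold is.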
Thank you very much!