I have built over 20 devices using the Jetson Nano, and we found that one unit switches off when the ambient temperature is around 40 °C. The other units run without issues up to an ambient temperature of 50 °C.
For the failing unit, I replaced the heat sink with a new one but still had the same issue. After replacing the Jetson Nano with a new one, the unit worked fine.
How can I find the source of the problem on the Jetson Nano? Can you help guide me to understand what is happening? Is there a specific log on the file system that provides the error information?
When I check the internal temperatures using tegrastats, the values are in an acceptable range. A device that works without issues switches off when the temperature reaches values above 97 °C, but on this device the temperatures are in the 70s when the unit switches off. Here are the results:
RAM 505/3956MB (lfb 704x4MB) SWAP 0/1978MB (cached 0MB) CPU [34%@1479,94%@1479,30%@1479,31%@1479] EMC_FREQ 0% GR3D_FREQ 15% PLL@69.5C CPU@77C PMIC@50C GPU@70C AO@78.5C thermal@73.5C
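To compare this unit against the working ones, the temperature fields in a tegrastats line like the one above can be pulled out with a short script. This is only a minimal sketch: it assumes the `NAME@valueC` format shown in the output above, and `parse_temperatures` is a hypothetical helper name, not part of any NVIDIA tool.

```python
import re

# Matches sensor readings such as "CPU@77C" or "AO@78.5C".
# The pattern is an assumption based on the tegrastats line quoted above.
TEMP_PATTERN = re.compile(r"(\w+)@(\d+(?:\.\d+)?)C\b")


def parse_temperatures(line):
    """Return a dict mapping sensor name to temperature in degrees Celsius."""
    return {name: float(value) for name, value in TEMP_PATTERN.findall(line)}


sample = (
    "RAM 505/3956MB (lfb 704x4MB) SWAP 0/1978MB (cached 0MB) "
    "CPU [34%@1479,94%@1479,30%@1479,31%@1479] EMC_FREQ 0% GR3D_FREQ 15% "
    "PLL@69.5C CPU@77C PMIC@50C GPU@70C AO@78.5C thermal@73.5C"
)

temps = parse_temperatures(sample)
hottest = max(temps, key=temps.get)  # sensor with the highest reading
print(temps)
print(hottest, temps[hottest])
```

The per-core load entries such as `34%@1479` are not picked up, because the pattern requires a word character directly before the `@` and a trailing `C`.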
I am wondering whether other components are affected by the temperature and are not performing properly. How can I debug the unit to get more information about this problem?
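One way to capture the last readings before the unit switches off would be to log the kernel's thermal zones to a file and force every line to disk. The sketch below assumes the standard Linux sysfs layout (`/sys/class/thermal/thermal_zone*/type` and `temp`, with `temp` in millidegrees Celsius). The function names, the log path, and the sampling interval are hypothetical choices for illustration.

```python
import os
import time
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone type: temperature in deg C} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
        readings[name] = millideg / 1000.0
    return readings


def log_readings(log_path, samples, interval_s=1.0, base="/sys/class/thermal"):
    """Append timestamped readings to log_path, syncing each line to disk."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            zones = read_thermal_zones(base)
            fields = " ".join(f"{name}={temp:.1f}" for name, temp in zones.items())
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {fields}\n")
            log.flush()
            os.fsync(log.fileno())  # so the line survives a sudden power-off
            time.sleep(interval_s)
```

For example, `log_readings("thermal.log", samples=3600, interval_s=1.0)` would record one line per second for an hour. After a shutdown, the final lines of the file show what each zone reported just before power was lost.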