Hello,
At my job, we have been using Jetson Nano 2GB boards to prototype one of our projects. One of our units has recently been having an issue, and I was wondering if anyone could help me understand what might be causing it.
For context, our units run a YOLO object detection network, along with a Python script that uses OpenCV to prepare camera frames for our model. We have these prototype units installed at a few of our clients' locations so we can verify our model's performance and gather more video footage to strengthen the model.
The issue we have been facing over the past month is that one of our units keeps losing power, even though it is still plugged in to the wall outlet. Unplugging the power cable from the unit and plugging it back in makes the unit boot back up, but only temporarily; the unit loses power again after a few days to a week.
I did some digging into the kernel log and syslog, and noticed a lot of these “soctherm: OC ALARM 0x00000001” warnings in the kernel log. It looks like a new instance of this warning is logged every second, up until the point where the box loses power again. I looked in the forums, and it looks like this error is telling me that the unit is receiving low voltage? If that is the case, I am trying to figure out what the point of failure could be: the power supply we are using, or the outlet our client has the unit plugged in to. For reference, we are using this CanaKit USB-C power supply.
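In case it helps anyone looking at the same thing, here is a rough sketch of how the OC ALARM warnings described above could be tallied per hour, to see when they start and whether they line up with the shutdowns. It assumes the kernel log lines start with a classic syslog timestamp (month, day, time); the function name and the timestamp format are my own assumptions, not something confirmed on this unit.

```python
import re
from collections import Counter

# The alarm text is the one quoted from the kernel log above.
ALARM_PATTERN = re.compile(r"soctherm: OC ALARM (0x[0-9a-fA-F]+)")
# Assumption: each line starts with a syslog-style timestamp, e.g. "Mon DD HH:MM:SS".
TIMESTAMP_PATTERN = re.compile(r"^(\w{3}\s+\d+\s+\d{2}):\d{2}:\d{2}")


def count_alarms_per_hour(lines):
    """Return a Counter mapping 'Mon DD HH' to the number of OC ALARM lines."""
    counts = Counter()
    for line in lines:
        if not ALARM_PATTERN.search(line):
            continue  # not an OC ALARM line
        stamp = TIMESTAMP_PATTERN.match(line)
        # Collapse repeated spaces so single-digit days produce a tidy key.
        key = " ".join(stamp.group(1).split()) if stamp else "unknown"
        counts[key] += 1
    return counts


def summarize_log(path):
    """Print one line per hour with the number of alarms seen in that hour."""
    with open(path, errors="replace") as log_file:
        counts = count_alarms_per_hour(log_file)
    for hour, count in counts.items():
        print(f"{hour}:00  {count} alarms")
```

Calling `summarize_log("/var/log/kern.log")` would print the hourly totals, assuming that is where the kernel log lives on the unit. If the counts jump at particular times of day, that might point toward load or something else on the same circuit rather than a constant supply problem, but that is only a guess until the unit is checked.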
Overall, I really want to get all the information I can before I troubleshoot this unit with our client. I am wondering if anyone has more information about what this error signifies and, if it is related to undervoltage, what the likely point of failure is (bad power supply, bad outlet, a program using too much CPU, etc.).