We’re having an issue with some of our Jetson AGX Xaviers where they switch off unexpectedly.
We are fairly confident that we can rule out temperature as a cause, because we monitor it constantly and have never seen a Xavier reach any of the specified thresholds before switching off.
My questions:
Is the voltage at which the Xavier turns off configurable, or is it a fixed limit defined by the hardware?
Should there be any kind of log entry (dmesg / journal / debug uart / etc.) before switching off?
Unfortunately, the journal entries are out of order after the Xavier is powered on again, because it has no RTC battery and the systemd version that ships with Ubuntu / JetPack is too old to deal with that.
The portion of the circuit that asserts VDDIN_PWR_BAD_N when power is removed is designed to start the shutdown as soon as a voltage drop of ~0.5 V is detected.
Over what time period does the 0.5 V drop have to occur in order to trigger shutdown?
We have also tried reading /sys/devices/platform/c360000.pmc/reset_reason after turning it on again, but it always returns SYS_RESET_N.
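Since the journal timestamps cannot be trusted without an RTC battery, one option is to record the reset reason at every boot together with a persistent boot counter, so the order of boots stays unambiguous. Below is a minimal sketch of that idea. Only the sysfs path comes from the post above; the function names and the log and counter file locations are hypothetical and would need to be adapted.

```python
from pathlib import Path
from typing import Optional

# sysfs node quoted in the post above; everything else here is an assumption.
RESET_REASON = Path("/sys/devices/platform/c360000.pmc/reset_reason")


def read_reset_reason(path: Path = RESET_REASON) -> Optional[str]:
    """Return the reset reason string, or None if the node cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def next_boot_index(counter_file: Path) -> int:
    """Increment and return a persistent boot counter (first boot is 1)."""
    try:
        count = int(counter_file.read_text().strip())
    except (OSError, ValueError):
        count = 0  # missing or corrupt counter file: start over
    count += 1
    counter_file.write_text(f"{count}\n")
    return count


def record_boot(log_file: Path, counter_file: Path,
                reason_path: Path = RESET_REASON) -> str:
    """Append one 'boot=<n> reset_reason=<value>' line and return it."""
    reason = read_reset_reason(reason_path) or "unknown"
    line = f"boot={next_boot_index(counter_file)} reset_reason={reason}"
    with log_file.open("a") as fh:
        fh.write(line + "\n")
    return line
```

Calling `record_boot(...)` once early in boot (for example from a oneshot service) gives a log ordered by boot index rather than wall-clock time. It will not explain why the value is always SYS_RESET_N, but it makes it easy to see whether any boot ever reports something else.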
We are currently running all our Xaviers on 15 V. Our devices are currently in multiple locations throughout Europe and the failures are infrequent. It would be impractical to send someone to rewire all of them, especially since we don’t know the cause of the issue for sure.
Over what time period does the 0.5 V drop have to occur in order to trigger shutdown?
You should first probe the signals with an oscilloscope to check whether a voltage droop (> 0.5 V) exists and whether VDDIN_PWR_BAD_N is asserted, regardless of the time period.
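If the oscilloscope trace can be exported as time/voltage samples, a short script can flag the intervals where the supply sits more than 0.5 V below nominal. This is only a sketch: it assumes the droop is measured against a fixed nominal input voltage (15 V in this thread), which may not match how the detection circuit actually behaves, and the function name and sample format are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_droops(times: Sequence[float], volts: Sequence[float],
                nominal: float, threshold: float = 0.5
                ) -> List[Tuple[float, float, float]]:
    """Return (start_time, end_time, min_voltage) for each run of
    consecutive samples below nominal - threshold."""
    limit = nominal - threshold
    events: List[Tuple[float, float, float]] = []
    start = None      # time of the first sample in the current droop
    lowest = 0.0      # lowest voltage seen in the current droop
    last_t = 0.0      # time of the latest sample in the current droop
    for t, v in zip(times, volts):
        if v < limit:
            if start is None:
                start, lowest = t, v
            else:
                lowest = min(lowest, v)
            last_t = t
        elif start is not None:
            events.append((start, last_t, lowest))
            start = None
    if start is not None:  # trace ended while still in a droop
        events.append((start, last_t, lowest))
    return events
```

For example, `find_droops(times, volts, nominal=15.0)` on a captured trace lists each droop with its duration and depth, which can then be compared against the moments VDDIN_PWR_BAD_N is asserted on the second scope channel.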
If I understand the manual correctly, that is one of the pins connecting the AGX Xavier module to the carrier board. Is this pin reachable on the devkit without destroying it?
Yes, it is a pin on the carrier board and can be reached. Please ask your hardware engineer to check that based on the carrier board schematic and layout in the P2822 package.