Xavier NX randomly reboots

Hi, I purchased a Xavier NX and it wouldn’t boot. I replaced it and got a new one, but I am having similar issues (i.e. it randomly reboots), and it took more than 2 hours to boot the first time. I would be surprised if I got two faulty Xaviers, but I am not sure why this is happening and would love some help here.

Thank you!

Could you monitor the serial console log and see what is printed right before the reboot?

https://elinux.org/Jetson/General_debug

Hi, we have similar issues with some Xavier NX devices. It often happens while pulling large image layers (> 660 MB) from Docker Hub, mostly during the extraction step. Unfortunately, nothing is printed on the serial console; the device just reboots. The SD card has enough free space and there is also enough free memory.

We also encountered the problem on multiple devices, so I don’t think they are all faulty.
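One way to back up the free-space and free-memory observation above is to log both continuously and flush the log to disk, so the last reading before a silent reboot survives. The following is only a minimal sketch: the function name `log_snapshot` and the log path in the usage note are hypothetical, and it assumes a Linux system that exposes `MemAvailable` in `/proc/meminfo`.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append one timestamped
# snapshot of available memory and free root-filesystem space to a log.
log_snapshot() {
    # $1 = path of the log file (assumed to be writable)
    mem_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    disk_kb=$(df -Pk / | awk 'NR==2 {print $4}')
    echo "$(date '+%F %T') mem_avail_kb=$mem_kb root_free_kb=$disk_kb" >> "$1"
    sync   # flush to storage so the last line survives an abrupt reboot
}
```

It could be run in the background during a pull, for example `while true; do log_snapshot /tmp/reboot-watch.log; sleep 1; done &`, and the log inspected after the next reboot.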

@jeremyfraenkel: Could you solve the issue with your devices?
@WayneWWW: Any idea what could be the reason for this and how we can further debug the issue?

Hi,

If it does not print anything, then this is probably an issue with the power supply.

Are you reproducing this issue on an NV devkit?

Hi WayneWWW,

Yes, it is happening on the Jetson Xavier NX devkit. Maybe it has something to do with the SD card. We experience the issue mostly with the Samsung EVO Plus 32GB (MB-MC32GA/EU) SD card. We have now tried the WD Purple 32GB (WDD032G1P0C), which seems to behave better.
But if it were the SD card, I would expect some error output on the serial console. We also checked for voltage drops from the power supply, but it looks fine.

Do you have any idea how to further debug this?

If this is the NX devkit, are you able to reproduce this issue on every NX you have?

Could you share how to reproduce your error so that I can try it on my device?

We can reproduce this issue with multiple devkits equipped with the Samsung EVO Plus 32GB. With devkits equipped with the WD Purple, we haven’t seen this problem yet.

As mentioned before, it happens while pulling large Docker images. It doesn’t happen every time, so you may have to repeat it. For example, you could run the following commands multiple times:

docker pull 3dvl/hemistereo-base-dev:jetson-xavier-nx-r32.4-latest
docker system prune -a

The device reboots mostly during the extraction step of some layers.
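Since the reboot is intermittent, the two commands above could be wrapped in a loop instead of being retyped. This is only a sketch under assumptions: the function names and the `DOCKER` override variable are hypothetical, and `-f` is added to `docker system prune` so it does not stop for confirmation (the original commands do not use it).

```shell
#!/bin/sh
# Hypothetical wrapper (not from the thread): repeat the pull/prune cycle.
IMAGE="3dvl/hemistereo-base-dev:jetson-xavier-nx-r32.4-latest"
DOCKER="${DOCKER:-docker}"   # assumption: set DOCKER=echo for a dry run

repro_cycle() {
    # $1 = iteration number, printed so the last cycle before a reboot is visible
    echo "iteration $1: $(date)"
    "$DOCKER" pull "$IMAGE" || return 1
    # -f skips the confirmation prompt; -a removes all unused images
    "$DOCKER" system prune -a -f || return 1
}

repro_loop() {
    # $1 = number of pull/prune cycles to run
    i=1
    while [ "$i" -le "$1" ]; do
        repro_cycle "$i" || return 1
        i=$((i + 1))
    done
}
```

Calling `repro_loop 10` would run ten cycles; piping the output through `tee` to a file on a different machine (for example over SSH) keeps a record of which iteration was running when the device went down.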

Just speculation: I think the 970 Evo Plus is a very fast SSD, and the amount of power it requires will depend on what operations it is performing, especially write bandwidth. Do you happen to know if the SSD is running extensive operations at the moment of reboot? Perhaps it simply draws a bit too much power during those operations.

Hi linuxdev,
it happens with the Samsung EVO Plus SD card, not an SSD.