AGX Xavier won't boot after replacing SSD

Hi folks,

I bought my AGX Xavier kit from Seeed Studio here.

I then installed an NVMe drive and moved my rootfs there following these instructions, which went well.

As I needed more space, I replaced the NVMe drive with a bigger one. But now the device won't boot. It stops at the point shown in this picture.

After freezing and rebooting a few times, it ends up at a black screen with a flashing cursor in the top-left corner.

If I remove the NVMe drive, the device boots just fine. I have tried reflashing the kernel (JetPack 5.1.1 and 5.1) with

sudo ./flash.sh jetson-agx-xavier-devkit mmcblk0p1

and fixing the bootloader with

sudo ./flash.sh -r -k cpu-bootloader jetson-agx-xavier-devkit mmcblk0p1

but it did not fix the problem.

Here’s the kernel flashing log.
log_flash.txt (451.1 KB)

It seems the device somehow decided to reject the new NVMe drive completely, and after a few flashes it would not accept the old one either.

Thanks so much in advance.

Hi ttungnguyen2205,

It is a custom board for AGX Xavier.
Do you use the custom BSP package from Seeed Studio to flash your board?

Please also provide the serial console log so we can see what is going on with the board.

Hi KevinFFF,

I did try the custom packages. The custom 4.6.1 package, which originally came with the board, did work. However, the 5.x ones showed the same problem.

Also, this board does not come with a micro-USB port so I haven’t figured out how to dump the console log yet. Do you have any idea?

Is the 5.x also the custom BSP package from the vendor?

For our AGX Xavier devkit, there is a micro-USB port on the carrier board for the serial console log.
For the custom board from Seeed Studio, you may ask the vendor how to capture the serial console log. Maybe they use a USB Type-C port instead; you could check whether any new node appears in /dev on your host PC after you plug in the Type-C cable.
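The /dev check can be sketched as a before/after comparison on the host PC. This is only a rough sketch: it assumes the console shows up as a ttyUSB* or ttyACM* node, which may not be true for this carrier board, and the function names are hypothetical helpers.

```shell
# List serial-style device nodes (ttyUSB*, ttyACM*) in a directory, sorted.
# Defaults to /dev; the directory argument exists mainly for testing.
list_serial_nodes() {
    dir="${1:-/dev}"
    for node in "$dir"/ttyUSB* "$dir"/ttyACM*; do
        # Skip unmatched glob patterns, which stay as literal strings.
        if [ -e "$node" ]; then
            echo "$node"
        fi
    done | sort
}

# Print nodes that appear in the "after" listing but not in "before".
# $1 = file with the listing taken before plugging, $2 = listing taken after.
new_serial_nodes() {
    comm -13 "$1" "$2"
}

# Typical use on the host PC:
#   list_serial_nodes > /tmp/before.txt   # cable unplugged
#   list_serial_nodes > /tmp/after.txt    # cable plugged in
#   new_serial_nodes /tmp/before.txt /tmp/after.txt
```

If a new node appears, a terminal program can be pointed at it, for example `sudo screen /dev/ttyACM0 115200`, where the device name is only a placeholder and 115200 is the baud rate Jetson consoles normally use.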

Is the 5.x also the custom BSP package from the vendor?

Yes. 5.0.1 and 5.0.2 can be found on the vendor’s website.
And I want to note that even the official 5.1.1 package did work when I tried to upgrade from custom 4.6.1. I think upgrading broke my bootloader somehow. Is that at all possible?

Please just use the latest JP5.1.1 from your vendor to flash the custom carrier board for AGX Xavier.
If there is any boot issue, please provide the serial console log so we can check further.

@KevinFFF
I mean the JP5.1.1 from the official NVIDIA page; so far the vendor has only released up to 5.0.2.
I have tried the vendor's 5.0.1, 5.0.2 and 4.6.1. Of those, the JP4.6.1 package worked.

But your suggestion did give me a hunch.
I went back and tried the old SSDs that had worked when I upgraded to JP5.1.1 (official NVIDIA), and it turned out they work with the vendor's JP5.0.2 too. So I read the specs of my SSDs carefully and found that the new ones, which don't work with JP5.x, are rated at 2.8A, while the ones that do work are rated at 1.4A and 1.9A.

Let me sum that up.

OLD NVMe ssds

  • Voltage: 3.3V
  • Current: ≤1.9A (1.4A and 1.9A)
  • Work with 4.6.1 (vendor, official NVIDIA), 5.0.x (vendor) and 5.1.x (official NVIDIA)

NEW NVMe ssds

  • Voltage: 3.3V
  • Current: 2.8A
  • Work with 4.6.1 (vendor, official NVIDIA) only; fail to boot with 5.x
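
For scale, the ratings above can be turned into peak power with P = V × I. This is a small sketch that assumes the listed currents are the drives' maximum ratings rather than measured draw; `peak_power_w` is a hypothetical helper name.

```shell
# Peak power in watts from the label voltage (V) and current (A): P = V * I.
peak_power_w() {
    awk -v v="$1" -v i="$2" 'BEGIN { printf "%.2f\n", v * i }'
}

peak_power_w 3.3 1.9   # old drives, highest rating: prints 6.27
peak_power_w 3.3 2.8   # new drives: prints 9.24
```

So the new drives may ask for roughly 3 W more from the 3.3V rail at peak than the old ones.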

So my assumption now is that the JP5.x kernel device tree somehow limits the supported current, so a device whose operating current exceeds that threshold leaves the kernel unable to boot.

Do you know how I could change that threshold in the device tree?

Hi, are you talking about a custom design? If so, you should check the carrier board design with the vendor, as the NVMe connector is on the carrier board and the 3.3V supply comes from the carrier board too. Generally 2.8A should be no problem as far as I know.

Hi @Trumany
Both my old and new NVMe SSDs use 3.3V. The 2.8A ones work when I flash the board with JP4.6.1 and don't when I use JP5.x instead. So it doesn't seem to be a hardware problem.

I have also posted a similar question on Seeed Studio’s forum but so far no answers.

Could you provide the flash logs of both JP4.6.1 and JP5.1.1 with your “NEW NVMe SSD” for further check?

Is this still an issue that needs support? Are there any results you can share? Thanks

@KevinFFF @kayccc I apologize for the late reply.

Here are the flash logs with different Jetpack versions from both the vendor and the official nvidia website.
flash_log_jp511_official_nvidia.txt (453.8 KB)
flash_log_jp502_official_nvidia.txt (466.1 KB)
flash_log_jp461_official_nvidia.txt (428.1 KB)
flash_log_jp461_custom_drivers.txt (422.8 KB)
flash_log_jp501_custom_drivers.txt (81.6 KB)

I’ve checked these 5 logs; it seems all of them flashed successfully.

Was the “New NVMe SSD” installed for all 5 of these flash logs?
Could you provide the flash command you used?

All of them with the new NVMes. Flashing went fine until the first boot. And the weird thing is if I removed the new NVMes, the board would work again.

I used a standard flash command:

sudo ./flash.sh jetson-xavier mmcblk0p1

What do you mean by “until first boot”?
Is there any error during further boot?

This command is used for flashing the internal eMMC of AGX Xavier.

Would you like to use internal eMMC or external NVMe SSD as your boot device (rootfs)?

What do you mean by “until first boot”?
Is there any error during further boot?

I mean the first boot after flashing. That’s when the problem first appears and it persists for all subsequent boots if the new NVMes are plugged in.

This command is used for flashing the internal eMMC of AGX Xavier.
Would you like to use internal eMMC or external NVMe SSD as your boot device (rootfs)?

Of course I would want to boot from NVMe. But either way, there should be no problem if I flash the eMMC, right? I did also try flashing the NVMe and got the same problem.
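
Whenever the board does boot, one way to confirm which device actually holds the rootfs is to classify the root mount source. This is a minimal sketch: `classify_root_source` is a hypothetical helper, and it assumes mmcblk* is the internal eMMC, as it is with the mmcblk0p1 target used above.

```shell
# Classify a root filesystem source path as nvme, emmc, or other.
# Assumption: /dev/mmcblk* is the internal eMMC on this board.
classify_root_source() {
    case "$1" in
        /dev/nvme*)   echo "nvme" ;;
        /dev/mmcblk*) echo "emmc" ;;
        *)            echo "other" ;;
    esac
}

# On the Jetson itself (findmnt is part of util-linux):
#   classify_root_source "$(findmnt -n -o SOURCE /)"
```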

Is there any log from when you hit the issue?

It is necessary to flash the eMMC correctly because some bootloader partitions reside on it. What is the problem with your new NVMe drive? We would need to check the serial console log.
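
Once a console log has been saved to a file, the NVMe- and PCIe-related lines can be pulled out for a first look. This is just a sketch: `filter_storage_lines` and the file name are hypothetical, and what the relevant messages look like on this board is not known here.

```shell
# Print lines mentioning NVMe or PCIe from a saved log file (case-insensitive).
filter_storage_lines() {
    grep -i -E 'nvme|pcie' "$1"
}

# Example with a saved serial console capture:
#   filter_storage_lines console.log
# Or directly on a board that has booted:
#   dmesg | grep -i -E 'nvme|pcie'
```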
