Inconsistent flashing with initrd

Hi,
I am currently having a really hard time flashing JP4.6 or JP5.1.1 to a Xavier NX using the initrd script. It worked a couple of times for JP5.1.1, but it is really flaky, and I have not been able to get it to work at all on JP4.6.
I want to flash the NVMe. The different attempts:

  1. Worked a couple of times on JP5.1.1:
    First:
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_external.xml --showlogs jetson-xavier-nx-devkit-emmc nvme0n1p1

Then:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --flash-only --external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_external.xml --showlogs jetson-xavier-nx-devkit-emmc nvme0n1p1

Sometimes it doesn't work. It doesn't matter whether I use a custom kernel/dtb; with a default environment and rootfs it still fails.

  2. Didn't work, tested only on JP4.6:
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash jetson-xavier-devkit-emmc internal

Tested with:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --external-device nvme0n1p1 -S 2100000000 -c ./tools/kernel_flash/flash_l4t_external.xml --external-only --append jetson-xavier-devkit-emmc external

And also with this other one, to just test if the emmc could get flashed

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --flash-only jetson-xavier-devkit-emmc internal
  3. SDK Manager: this also worked about once. I haven't tried it further, since the client has a production line and needs the flashing to be done via a simple command.
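
For the "simple command" requirement, the two-step sequence from item 1 could be wrapped in one script. This is only a sketch: the function names and the DRY_RUN variable are hypothetical, it assumes the second step is meant to be --flash-only (as in the later restatement of these commands in this thread), and it must be run from the Linux_for_Tegra directory. The flags are copied from the commands above; nothing here changes what the NVIDIA script does.

```shell
#!/bin/sh
# Hypothetical wrapper around the two l4t_initrd_flash.sh calls quoted in this thread.
FLASH=./tools/kernel_flash/l4t_initrd_flash.sh
COMMON="--external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_external.xml --showlogs"
TARGET="jetson-xavier-nx-devkit-emmc nvme0n1p1"

# Print the full command for one phase.
# $1 is "--no-flash" (generate images) or "--flash-only" (flash them).
flash_cmd() {
    echo "sudo $FLASH $1 $COMMON $TARGET"
}

# Run both phases in order and stop at the first failure.
# With DRY_RUN=1 the commands are only printed, not executed.
flash_nvme() {
    for phase in --no-flash --flash-only; do
        cmd=$(flash_cmd "$phase")
        echo "+ $cmd"
        [ "${DRY_RUN:-0}" = 1 ] && continue
        sh -c "$cmd" || return 1
    done
}
```

Calling flash_nvme with DRY_RUN=1 set prints the two commands without touching the board, which is a cheap way to check the flags before a real run.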

Errors:
With both:

***************************************
*                                     *
*  Step 3: Start the flashing process *
*                                     *
***************************************
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh .......................................................................................................................................................................................Timeout
Cleaning up...

Only with JP5.1.1:

Waiting for device to expose ssh …RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh …Timeout
Cleaning up…

Other details:

  • I have been using the usb flashing port that is available.
  • Cannot hook up a usb to check serial(custom carrier).
  • The normal flash to eMMC is reliable and works fine.
  • Also tried formatting the NVMe to leave it clean.
  • Have tried flashing default JP to eMMC with SDK Manager and then doing the initrd flash to NVMe (failed as well).
  • Can plug in a screen during the initrd flash; I get only a blinking cursor and it doesn't accept input, only ctrl+alt+del to exit.
  • Tried starting and stopping udisks2.service

Regards,
Andres
Embedded SW Engineer at RidgeRun
Contact us: support@ridgerun.com
Developers wiki: https://developer.ridgerun.com/
Website: www.ridgerun.com

I can’t help, but I want to ask something relevant: Does the failed initrd boot succeed if you then reset power (maybe a couple of times)? Or does the boot fail from then on? I’m interested in separating whether (A) it is the flashed content which fails, or (B) the same content might work part of the time. Incidentally, I can’t tell you where the flashed initrd itself is located, but if it is the content itself which is failing, then you might want to post two copies of the initrd: the working copy and the failing copy. I or someone else could look at that content and see where it differs (and knowing which part of the initrd fails is a big clue).

The problem here is this one. Without a log, it is hard to tell what happened.

Cannot hook up a usb to check serial(custom carrier).

Initrd flash depends on the device tree. In this it is unlike flash.sh.

flash.sh doesn’t care about the device tree. It just uses the default route to flash your board.
However, initrd flash will flash the bootloader to your board first, and then use an initrd to flash your board for the specific use case (e.g. NVMe boot / USB boot).

This “use initrd to flash your board” step uses the device tree, and some things are mandatory here.

  1. It must have USB device mode working to send data from the host side to the device side. For example, if your custom board changes the design of the USB part and there is no longer a USB device mode, then initrd flash won’t work. Most likely you will see a timeout error.
  2. It must be able to find the external drive you want to flash. For example, if your NVMe on PCIe has a problem and cannot be detected, then it will hit a timeout error too.

Since the initrd behaves much like the normal kernel, you can check the above behaviors in the kernel first.
For example, if your NVMe SSD cannot be detected even in the kernel, then it won’t work in the initrd either.

Thus, without a log, there is not much we can help with. You can try the same case on a devkit first and see what log gets printed if there is any error.
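
The two prerequisites above could be checked from a normally booted Jetson with something like the following. It is a minimal sketch, not an NVIDIA tool: the helper names are made up for illustration, and 192.168.55.1 is assumed to be the device-side address of the USB device-mode network on a stock L4T image (the address mentioned later in this thread). Passing these checks in the normal kernel does not guarantee that the initrd behaves the same way.

```shell
#!/bin/sh
# Hypothetical helpers; each reads command output on stdin so it can be
# tried against saved logs as well as a live board.

# Input: output of `lsblk -dn -o NAME`.
# Succeeds if at least one NVMe disk (nvme0n1, nvme1n1, ...) is listed.
has_nvme_disk() {
    grep -q '^nvme[0-9]'
}

# Input: output of `ip -o -4 addr show`.
# Succeeds if the IPv4 address given as $1 is assigned to some interface.
has_ipv4_addr() {
    grep -qF "inet $1/"
}
```

On the Jetson this would be used as `lsblk -dn -o NAME | has_nvme_disk && echo "NVMe visible"` and `ip -o -4 addr show | has_ipv4_addr 192.168.55.1 && echo "USB device-mode network up"`.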

Hi, thanks for both replies.
First @WayneWWW, I can confirm the NVMe gets detected: I can boot into a normally flashed JetPack, run lsblk, and see it. Also, the NVMe flash worked about two times with the same hardware and the custom kernel/dtb, but only on JP5.1.1. The devkit carrier is a good idea; I will get one and try to get the logs from it.
@linuxdev, after the l4t_initrd_flash.sh script loads the temp fs, I think it boots, since I get display output and a blinking cursor. However, I’m not sure that’s a successful boot. It is doing something, since I can plug in a keyboard, hold ctrl-alt-del, and it reboots to the old JP that was loaded on eMMC. I can tell you that if I intentionally mess something up in the dtb, I can get the kernel to crash and can see the crash logs on the display after the script loads the temp fs. And if I cycle power, the JP that was on eMMC loads; nothing gets altered.

Regards,
Andres

Also, @WayneWWW, maybe you can help me check whether what I’m trying to do while I wait for a devkit carrier can work. I saw that I can make the internal image and external image separately with:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash jetson-xavier-devkit-emmc internal

And

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --external-device nvme0n1p1 -S 2100000000 -c ./tools/kernel_flash/flash_l4t_external.xml --external-only --append jetson-xavier-nx-devkit-emmc external

And then I should be able to load an image without any modifications first and use it to actually flash the image with the modifications, but it didn’t work. I could see that the default kernel was loading first, since there are some lights that only come on with the custom image and they were off, but I’m not sure if I’m missing another parameter on the script or something like that.

Does it have wired ethernet? If so, do you have a means to detect if DHCP traffic runs? Or do you have access to router logs such that you can see the MAC address and any address assignment? Knowing if that occurs would say if boot itself succeeds (DHCP won’t request an address if the Linux kernel does not load…unless you’ve modified boot). Knowing if an IP address is available for ssh would be a big bonus to debugging.
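
One way to look for the board on the network from the host side, as a rough sketch only: watch for DHCP packets with tcpdump, or look the board's MAC address up in the host's neighbor table. The helper name, the interface name eth0, and the MAC address used in the usage line are placeholders, not values from this thread. The neighbor table only lists hosts the PC has recently exchanged traffic with, so an empty result is not proof that the board is down.

```shell
#!/bin/sh
# Hypothetical helper. Input: output of `ip neigh show` on the host.
# $1: MAC address of the Jetson's Ethernet port (matched case-insensitively).
# Prints the IP address of every neighbor entry with that MAC.
ip_for_mac() {
    awk -v mac="$(printf '%s' "$1" | tr 'A-F' 'a-f')" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "lladdr" && tolower($(i + 1)) == mac)
                    print $1
        }'
}

# To watch DHCP requests live instead (interface name is a placeholder):
#   sudo tcpdump -i eth0 -n 'port 67 or port 68'
```

Usage on the host would be `ip neigh show | ip_for_mac aa:bb:cc:dd:ee:ff`, with the MAC replaced by the board's own.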

Don’t know if this will help, but are you familiar with Magic SysRq? If used over serial console it has to use echo to the right “/proc” file, but if USB has a keyboard directly connected, then you could gain some control over shutdown function. Example:

# sync twice:
ALT-SYSRQ-s
ALT-SYSRQ-s
# go to read-only mode:
ALT-SYSRQ-u
# Force reboot:
ALT-SYSRQ-b

The above won’t help much, and not all Magic SysRq combinations will be enabled, but perhaps “ALT-SYSRQ-v” is available to restore the framebuffer console. I have no idea if this works on Jetsons, or if it is enabled. It may be useful in some cases even though you can see neither the serial console nor a monitor.

Incidentally, if root uses echo of various keys to “/proc/sysrq-trigger”, this also works over serial console (sadly it seems you don’t have serial console). Example of alternate method for anyone interested (from over serial console):

# become root first (the redirections below need a root shell):
sudo -s
# sync twice:
echo s > /proc/sysrq-trigger
echo s > /proc/sysrq-trigger
# Go to read-only:
echo u > /proc/sysrq-trigger
# Force reboot:
echo b > /proc/sysrq-trigger
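
Whether the keyboard combinations are permitted can be read from /proc/sys/kernel/sysrq: 0 disables them, 1 enables all of them, and any other value is a bitmask (16 for sync, 32 for remount read-only, 128 for reboot/poweroff). The helper below is an illustrative sketch with a made-up name. Note that this setting only restricts the keyboard path; writes to /proc/sysrq-trigger by root are always allowed.

```shell
#!/bin/sh
# Hypothetical helper.
# $1: the value read from /proc/sys/kernel/sysrq
# $2: the bit of the function to test (16 sync, 32 remount-ro, 128 reboot)
# Prints "yes" if the keyboard SysRq for that function is permitted, else "no".
sysrq_allows() {
    if [ "$1" -eq 1 ]; then
        echo yes                      # 1 enables every function
    elif [ $(( $1 & $2 )) -ne 0 ]; then
        echo yes                      # bit is set in the mask
    else
        echo no                       # 0, or the bit is not set
    fi
}
```

For example, `sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 128` reports whether ALT-SYSRQ-b is permitted on the running system.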

@linuxdev, it seems it’s not enabled; nothing happens when I hit those keys. It does have Ethernet, so I will check the router logs to see if it’s online.

Unfortunately the router doesn’t report a new connection; it seems something is preventing the network from coming up. I will verify once I get serial. The weird thing is that it happens with the custom rootfs with custom dtb/kernel and also with a fresh rootfs from Tegra_Linux_Sample-Root-Filesystem_R35.3.1_aarch64.tbz2 with only these changes:

sudo ./apply_binaries.sh
sudo ./tools/l4t_create_default_user.sh -u user -p root

It’s really not possible to tell what is wrong just by reading your flash command or the host-side log.

If you had seen the many kinds of initrd flash issues reported on this forum, you would know the flash command doesn’t provide much info.

I’d agree that without the serial console log there isn’t much possible. One exception might be due to the fact that you are using an NVMe drive. You could put that drive on a second computer, and look at logs in “/var/log”. If the Linux kernel did not load, no logs will be there.

@WayneWWW, Hi I’m back with logs now.
What I did:

  1. Flashed default JP5.1.1 on emmc using sdkmanager.
  2. Flashed default JP5.1.1 to the NVMe target using sdkmanager and got an error, the same
    …RTNETLINK answers: File exists, for a while and then a timeout.
  3. Took the NVMe out to format it and clear all partitions. Tried step 2 again, same error.
    Here are the logs from serial. The host-side logs are the same as at the beginning of the post.
    log_default_jp_sdkm.log (66.3 KB)

Some weird things:

  • Again it worked twice and after that it only fails. It worked the first time with no problem, with the NVMe and an eMMC that had an old JP 4.X installed. It worked the second time flashing the custom kernel and dtb.
  • For the next tries the JP/kernel/dtb were default; I even deleted the sdkm downloads to be sure everything was fresh.
  • I am using a different carrier and SoM than the first one, so it is probably not hardware related.

Regards,
Andres

PS, @linuxdev, the nvme is empty, no partitions, nothing, so no /var/log.

Is this test done on your board or on the NVIDIA devkit?

It’s on this carrier. I don’t have a devkit readily available, but the carrier is able to start and work normally with default JP, without their BSP; at least the Ethernet, USB flash port and serial port work fine. Also, the NVMe gets properly detected, and on the first two attempts it was flashed and booted properly afterwards.

Does USB device mode come up, and are you able to ssh between your host and the Jetson?

Do you mean the 192.168.55.1 via USB? On this carrier I can’t, but on the first carrier I can.

Just to be sure, does the l4t_initrd_flash script do any fuse flashing? I ask because it’s weird that on the two boards I was able to flash the NVMe two times, and after that I’m not able to do it following the exact process.

Hi,

Could you avoid using words like “this” or “first”? It would be easier to follow if you directly gave the board name, for example, the ConnectTech board.

It looks like you don’t have a stable board that can really use initrd flash. I mean the problem on your ConnectTech board may not be the same as what you hit on your custom board. The ConnectTech board does not have USB device mode working properly, so initrd flash failed. Your custom board may have USB device mode, but there is still some other error, and since you cannot dump the UART log, no one can really know what happened.

after that I’m not able to do it following the exact process.

After what?

Hi, yes, sorry for that. Let’s focus on the ConnectTech carrier. I was able to flash it two times, once with default JP5.1.1 (coming from JP4.X), and then with my custom JP5.1.1 with a custom dtb and kernel image. However, now I’m not able to flash it with either. The logs are the same ones that I attached; I have added them here as well.
log_default_jp_sdkm.log (66.3 KB)
The commands that I’ve used are:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_external.xml --showlogs jetson-xavier-nx-devkit-emmc nvme0n1p1

Then:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --flash-only --external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_external.xml --showlogs jetson-xavier-nx-devkit-emmc nvme0n1p1

Those commands worked the first two times, but now I get only the error:

…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
…RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh …Timeout
Cleaning up…

If it helps, I can get into the console of the initrd image, using serial after it gets loaded, so the initrd image seems to be loading fine.

Hi,

Just to clarify: it is ConnectTech’s responsibility to fix this USB issue (yes, this is a USB issue now), not yours. Also, if you are not an engineer from ConnectTech, then you may not be able to provide the info we need here (e.g. the board schematic).

The situation with the ConnectTech board is already clear. You cannot see the USB device network interface from your host side. Initrd flash uses this interface to flash data to the NVMe.
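
Whether the host sees the board at all while the flash script is waiting could be checked along these lines. This is a hedged sketch with hypothetical helper names: it relies on 0955 being NVIDIA's USB vendor ID and on USB network interfaces having a sysfs device path that runs through a usbN bus node, and it makes no assumption about the interface name or address the flash script uses.

```shell
#!/bin/sh
# Hypothetical host-side helpers.

# Input: output of `lsusb`.
# Succeeds if any device with NVIDIA's USB vendor ID (0955) is attached.
has_nvidia_usb() {
    grep -qi 'ID 0955:'
}

# Prints the names of network interfaces whose backing device sits on a USB bus.
# $1: sysfs network class directory (defaults to /sys/class/net).
list_usb_netdevs() {
    root="${1:-/sys/class/net}"
    for dev in "$root"/*; do
        [ -e "$dev/device" ] || continue      # virtual interfaces have no device link
        case "$(readlink -f "$dev/device")" in
            */usb[0-9]*) basename "$dev" ;;
        esac
    done
}
```

On the host, `lsusb | has_nvidia_usb && list_usb_netdevs` run during the "Waiting for device to expose ssh" phase shows whether a USB network interface from the board has appeared; if the list is empty, the symptom matches the description above.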

Got it, is there a way to use another normal network or another interface instead of the usb one?