I’ve been using a custom board with an Orin NX 16GB running Jetpack 6DP SDK 36.2 for a few months now.
When I try to flash the latest Jetpack 6.1 SDK 36.4, the flashing process fails because it cannot detect the NVMe disk.
This is my flashing command and the log: flash.log (278.3 KB)
I SSHed into the initramfs and collected the dmesg output: dmesg.log (34.2 KB)
PS: why is the kernel of SDK 36.4 not printing any information while booting during the flashing process?
The same flashing command works in SDK 36.2, and there the kernel prints all the boot information.
It seems you used the above flash command.
May I know why you added --no-systemimg?
Please also try using -c tools/kernel_flash/flash_l4t_t234_nvme.xml instead of -c tools/kernel_flash/flash_l4t_external.xml for partition layout file in your command.
But the issue seems to be that the nvme is not detected when I ssh into the initramfs for flashing to nvme.
We develop our own board in house, so we do not have a vendor. Since our board works very well with SDK 36.2 and the same flashing procedure, I tend to think the issue is not the custom board but rather something that has changed between SDK 36.2 and SDK 36.4.
Do you have some insights about which kernel modules or pieces of dtb we should look at to find out why the disk is not being recognized?
Kernel modules like nvme and nvme_core are loaded.
Could you please check the dmesg log I attached, mainly the PCI part, and let me know if you see anything different or unexpected?
I’ve even tried with different brands of nvme disks.
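For anyone who wants to narrow down the attached dmesg output before reading it in full, here is a minimal host-side sketch that keeps only the lines mentioning PCI or NVMe. The keyword pattern and the function name `filter_dmesg` are assumptions made for illustration, not part of any NVIDIA tool.

```python
import re
import sys

# Assumption: lines relevant to the missing disk mention "pci" or "nvme".
PATTERN = re.compile(r"pci|nvme", re.IGNORECASE)


def filter_dmesg(lines):
    """Return the lines that mention PCI or NVMe, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Usage: python filter_dmesg.py dmesg.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for line in filter_dmesg(fh):
            print(line)
```

Running the same filter on a dmesg capture from the working SDK 36.2 setup and comparing the two outputs side by side may show where PCIe bring-up diverges.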
If you still have the flash issue, please share the full flash log for further checking.
R36.2 is a developer preview release, which is not intended for production.
I remember the device tree loading is different.
Please check your /boot/extlinux/extlinux.conf to confirm which DTB is in use (FDT entry).
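The FDT check can be sketched as a small script that lists each boot label together with its FDT path. It assumes the usual extlinux.conf layout of `LABEL` blocks with indented keys; the function name `fdt_entries` is hypothetical.

```python
import sys


def fdt_entries(text):
    """Return (label, fdt_path) pairs for every uncommented FDT line."""
    entries = []
    label = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0].upper()
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "LABEL":
            label = value  # remember which boot entry we are inside
        elif key == "FDT":
            entries.append((label, value))
    return entries


# Usage: python fdt_entries.py /boot/extlinux/extlinux.conf
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        for label, path in fdt_entries(fh.read()):
            print(f"{label}: {path}")
```

An empty result only means that no uncommented FDT line was found in the file.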
Error: Could not stat device /dev/nvme0n1 - No such file or directory.
Flash failure
Either the device cannot mount the NFS server on the host or a flash command has failed. Check your network setting (VPN, firewall,...) to make sure the device can mount NFS server. Debug log saved to /tmp/tmp.sCTWaeGaBc. You can access the target's terminal through "sshpass -p root ssh root@fc00:1:1:0::2"
Cleaning up...
The problem is not with the NFS server. I’ve tried SSHing into the initramfs, and the NVMe disk is not being detected by the kernel.
What puzzles me is why this SDK 36.4 kernel is not printing any debug logs. I suspect that the PCIe controller is not detecting the NVMe because the right DTB is not being used when booting for flashing.
I’ve managed to dump the dtb in use in the initramfs when the flashing fails.
I attached it here converted to dts. initramfs_dtb_extracted.txt (332.0 KB)
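A quick way to inspect a dump like this is to list the `status` property of every PCIe controller node in the DTS text. The sketch below assumes a decompiled DTS (as produced by `dtc -I dtb -O dts`) with one node opening or property per line, and it treats any node whose name starts with `pcie@` as a controller. The function name `pcie_status` is hypothetical.

```python
import re
import sys

# A node opening looks like "name@addr {", optionally preceded by labels.
NODE_OPEN = re.compile(r"^\s*(?:[\w,.+-]+:\s*)*([\w,.+@/-]+)\s*\{\s*$")
STATUS = re.compile(r'^\s*status\s*=\s*"([^"]*)"\s*;')


def pcie_status(dts_text):
    """Map each pcie@... node name to its status string.

    None means the node has no status property, which the devicetree
    specification treats as enabled.
    """
    stack = []
    result = {}
    for line in dts_text.splitlines():
        opened = NODE_OPEN.match(line)
        if opened:
            stack.append(opened.group(1))
            if stack[-1].startswith("pcie@"):
                result.setdefault(stack[-1], None)
            continue
        if line.strip().startswith("};"):
            if stack:
                stack.pop()  # leave the current node
            continue
        status = STATUS.match(line)
        if status and stack and stack[-1].startswith("pcie@"):
            result[stack[-1]] = status.group(1)
    return result


# Usage: python pcie_status.py initramfs_dtb_extracted.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for node, state in pcie_status(fh.read()).items():
            print(node, state if state is not None else "(no status property)")
```

Running it on the dumps from both the devkit and the custom board would show whether the controller wired to the NVMe slot is enabled in each case.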
We tried different brands of NVMe disks, but still no luck. SDK 36.4 is not able to recognize the NVMe, but SDK 36.2 is.
In another attempt to understand why the nvme is not recognized I dumped the dtb from the initramfs boot before flashing in the devkit and in our custom carrier board and I found one interesting difference:
Yes, I’m sure this is a full log.
I also found it quite interesting, because the kernel is not printing debugging information at all.
Do you have any idea what could be causing this?
The differences I see between the boot process with Jetpack 36.2 and 36.4 are that with 36.4 there is no kernel debugging information and the NVMe is not recognized.
If you want to see the kernel information you can also have a look in the dmesg output I have added to the original message.
The dmesg information was collected over SSH to the Orin after the flashing process failed.
Just to clarify. So what is the exact situation in your serial console?
Did you see the UART log start to print and then suddenly stop? And after it had stopped for a while, did it give you the bash-5.1 initrd console to operate the board again?
When flashing the custom board, the serial console never reaches the bash-5.1 initrd prompt. That happens only when flashing the devkit.
I was working under the impression that the lack of an EEPROM on our custom board might make the flashing process use the wrong DTB, one that does not configure the UART serial port or the PCIe controller used by the NVMe.
A missing carrier board EEPROM does not affect UART or PCIe.
You can just run the test on rel-36.2 or rel-36.3. If the serial console log does not get stuck in those versions, then it matches what I said.
Please be aware that lots of users in this forum work with custom boards every day. No one has ever hit this stuck point just because their board has no EEPROM. In fact, none of the custom boards has an EEPROM on it.