Jetpack 4.5.1, TX2, BUG : FDT selected file loaded incorrectly by uboot

Hello, I have flashed a Jetson TX2 installed on a devkit with jetpack-4.5.1, and although it seems to work, I get strange messages coming from uboot before starting linux itself :

374522 bytes read in 31 ms (11.5 MiB/s)
## Flattened Device Tree blob at 88400000
   Booting using the fdt blob at 0x88400000
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
   Using Device Tree in place at 0000000088400000, end 000000008845e6f9
copying carveout for /host1x@13e00000/display-hub@15200000/display@15200000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15210000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15220000...

Is the dtbfile configured on the FDT clause in /boot/extlinux/extlinux.conf really loaded or not ?

My aim is actually to load another dtbfile, describing the devkit with our custom daughter board, where I get exactly the same messages from uboot, but then my kernel (exactly the same, I only change the dtb) fails in the dt parsing phase.

hello phdm,

could you please also check the kernel logs, for example, $ dmesg | grep DTS.
is the kernel init process actually choosing the correct device tree blob?

nvidia@devkit-jp451:~$ dmesg | grep DTS
[ 0.185688] DTS File Name: /tmp/macq-dev-tools-macq-build-dir-zTih1Sr4l/macq-cam5-kernel-4.9-4.9.201/src/kernel/kernel-4.9/arch/arm64/boot/dts/…/…/…/…/…/…/hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-macq-cam5-devkit.dts
[ 0.428212] DTS File Name: /tmp/macq-dev-tools-macq-build-dir-zTih1Sr4l/macq-cam5-kernel-4.9-4.9.201/src/kernel/kernel-4.9/arch/arm64/boot/dts/…/…/…/…/…/…/hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-macq-cam5-devkit.dts
nvidia@devkit-jp451:~$

The name matches what I expect, but nevertheless when the kernel parses the DT for smmu, it does strange things and fails. Debugging messages are mine.

[    0.423366] arm_smmu_device_dt_probe
[    0.423380] of_match_node(arm_smmu_of_match) = 2
[    0.423391] tegra_smmu_of_parse_sids(/iommu@12000000)
[    0.423403] domain_node = /iommu@12000000/domains
[    0.423413] child = /iommu@12000000/domains/gpu_domain
[    0.423450] as_node = /iommu@12000000/address-space-prop/gpu
[    0.423462] __tegra_smmu_parse_as_prop(gpu = /iommu@12000000/address-space-prop/gpu)
[    0.423481] child = /iommu@12000000/domains/host1x_domain
[    0.423513] as_node = /reserved-memory/fb0_carveout
[    0.423525] __tegra_smmu_parse_as_prop(fb0_carveout = /reserved-memory/fb0_carveout)
[    0.423544] arm-smmu 12000000.iommu: invalid address-space-prop fb0_carveout
[    0.423557] arm-smmu: Unable to parse tegra SIDs!
[    0.423573] arm-smmu: probe of 12000000.iommu failed with error -22

Looking at the device tree from within linux, I get these error messages :

nvidia@devkit-jp451:~$ dtc -I fs /proc/device-tree
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x_client has duplicated phandle 0x86 (seen before at /reserved-memory/fb1_carveout)
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x has duplicated phandle 0x85 (seen before at /reserved-memory/fb0_carveout)
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/common has duplicated phandle 0x87 (seen before at /reserved-memory/fb2_carveout)
<stdout>: ERROR (explicit_phandles): /xusb_padctl@3520000/pads/usb3/lanes/usb3-0 has duplicated phandle 0x9b (seen before at /reserved-memory/vpr-carveout)
ERROR: Input tree has errors, aborting (use -f to force output)
nvidia@devkit-jp451:~$

Those errors are not present in my dtb file.
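
The dtc messages above are easier to compare between boots when they are grouped by phandle value. Below is a minimal sketch that does this, assuming the message format is exactly the one shown in this thread; the function name and the regular expression are illustrative and not part of dtc.

```python
import re
from collections import defaultdict

# Matches lines such as:
# <stdout>: ERROR (explicit_phandles): /a/b has duplicated phandle 0x86 (seen before at /c/d)
DUP_RE = re.compile(
    r"ERROR \(explicit_phandles\): (?P<node>\S+) has duplicated phandle "
    r"(?P<phandle>0x[0-9a-fA-F]+) \(seen before at (?P<first>\S+)\)"
)


def find_duplicate_phandles(dtc_output):
    """Map each duplicated phandle value to the node paths that share it."""
    clashes = defaultdict(list)
    for line in dtc_output.splitlines():
        m = DUP_RE.search(line)
        if not m:
            continue
        phandle = int(m.group("phandle"), 16)
        # Record the node seen first, then the node that reuses the value.
        for path in (m.group("first"), m.group("node")):
            if path not in clashes[phandle]:
                clashes[phandle].append(path)
    return dict(clashes)


# Two of the lines reported above, used as sample input.
SAMPLE_DTC_OUTPUT = """\
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x_client has duplicated phandle 0x86 (seen before at /reserved-memory/fb1_carveout)
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x has duplicated phandle 0x85 (seen before at /reserved-memory/fb0_carveout)
"""

if __name__ == "__main__":
    for phandle, nodes in sorted(find_duplicate_phandles(SAMPLE_DTC_OUTPUT).items()):
        print(hex(phandle), nodes)
```

On the target, the input could be captured with something like dtc -I fs /proc/device-tree 2>&1 >/dev/null, since the sample output above suggests the messages are not part of the tree dump itself.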

hello phdm,

just for confirmation,
let’s check whether the error still persists when the device tree is identical in both the kernel-dtb partition and the filesystem.
thanks

hello JerryChang,

how can I update the kernel-dtb partition with my dtb file from within the linux running on my devkit ? I know I must overwrite /dev/mmcblk0p30 and perhaps also /dev/mmcblk0p31, but I don’t know how to produce the exact content I must write there.

This is only for the purpose of the test you asked.

Actually, I would like not to need to change the kernel-dtb partitions when I switch the daughter board on my devkit. That used to work with jetpack-4.3.

Later…
I have now found and used this tool nv-tegra-sign/sign.py https://github.com/kmartin36/nv-tegra-sign/ to produce a signed version of my dtb file, and I have copied the signed version to the kernel-dtb partition using

sudo dd if=/tmp/dtbfile.signed of=/dev/mmcblk0p30

The following messages still appear

## Flattened Device Tree blob at 88400000
   Booting using the fdt blob at 0x88400000
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
   Using Device Tree in place at 0000000088400000, end 0000000088442f3e
copying carveout for /host1x@13e00000/display-hub@15200000/display@15200000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15210000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15220000...

but in linux, my dtb file is not garbled anymore. This is clearly a bug in the way uboot uses the FDT entry now in jetpack-4.5.1, a bug that did not trigger for me with jetpack-4.3.

If I now try to boot with the original base dtb, I get errors similar to those in the reverse case :

nvidia@devkit-jp451:~$ dtc -I fs /proc/device-tree
<stdout>: ERROR (explicit_phandles): /host1x/nvcsi@150c0000/channel@5 has duplicated phandle 0x166 (seen before at /reserved-memory/ramoops_carveout)
<stdout>: ERROR (explicit_phandles): /host1x/vi@15700000/ports/port@3/endpoint has duplicated phandle 0x6d (seen before at /reserved-memory/fb0_carveout)
<stdout>: ERROR (explicit_phandles): /host1x/vi@15700000/ports/port@4/endpoint has duplicated phandle 0x6f (seen before at /reserved-memory/fb2_carveout)
<stdout>: ERROR (explicit_phandles): /i2c@3180000/tca9548@77/i2c@4/imx219_e@10/ports/port@0/endpoint has duplicated phandle 0x6e (seen before at /reserved-memory/fb1_carveout)
<stdout>: ERROR (explicit_phandles): /pmc@c360000/hdmi-dp1-dpd-disable has duplicated phandle 0x83 (seen before at /reserved-memory/vpr-carveout)
ERROR: Input tree has errors, aborting (use -f to force output)
nvidia@devkit-jp451:~$

When putting a DTB in a partition it has to be signed. If not, then I suspect the corruption is from treating the file as if it is signed. You wouldn’t have that issue if naming it via the FDT entry in extlinux.conf. If you were to overwrite the partition while running live, then it would have to be signed and then written with a tool like dd. You would also need to be sure your new tree is small enough to fit in the partition.

In your flash software within the “Linux_for_Tegra/” directory look for “l4t_sign_image.sh”.

Hello linuxdev,

there exists a python script that can be run on a TX2 and that can sign a dtb to produce what should be written in /dev/disk/by-partlabel/kernel-dtb. You can find it at https://github.com/kmartin36/nv-tegra-sign/. Using it is simple :

nv-tegra-sign/sign.py /boot/dtbfile /tmp/dtbfile.signed

It worked perfectly for me, as I wrote above.

With jetpack-4.5.1 on TX2, when the kernel-dtb partition is flashed with the same dtb file (signed) as the one selected by the FDT line from extlinux.conf, the DT is correctly given to linux. But if they differ, then linux gets a corrupted DT, not the one from kernel-dtb, nor the one from the FDT line, but a corrupted one. Look at the messages produced by ‘dtc’ in my posts above. That’s a BUG !

hello phdm,

yes, it looks like a bug to me also.
since we would like to reproduce the same failure on the reference board, could you please share some information about your device tree modification?
thanks

Hello JerryChang,

here you are

dtbfile (255.8 KB)

basically, it describes the devkit with an image sensor connected to an SPI bus

You might also want to mention the exact byte size of the old and new signed trees. I’m wondering if the new one is too large to fit the partition. As a test, if you use a non-signed version via the FDT entry in extlinux.conf (with the partition version being unmodified), then it would be quite interesting to know if this non-signed version does or does not produce the same error. You might also include the output of “sudo gdisk -l /dev/mmcblk0” (which will show approximate partition sizes).

my custom dtb file is smaller than the default dtb file, and thus also smaller than the kernel-dtb partition.
If I put a signed version of my custom dtb file in /dev/disk/by-partlabel/kernel-dtb, then my custom dtb is loaded correctly. It also continues to be loaded correctly when only the contents of nvidia,dtbbuildtime and similar strings have changed.
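
The size question can also be checked on the target before running dd. The following is a minimal sketch, assuming the signed image and the partition node are both readable (reading a block device normally requires root); the helper names are illustrative.

```python
import os


def size_in_bytes(path):
    """Size of a regular file or a block device, found by seeking to its end."""
    with open(path, "rb") as f:
        return f.seek(0, os.SEEK_END)


def fits_in_partition(image_path, partition_path):
    """True when the image is no larger than the partition it would be written to."""
    return size_in_bytes(image_path) <= size_in_bytes(partition_path)
```

With the paths used in this thread, the call would be fits_in_partition("/tmp/dtbfile.signed", "/dev/disk/by-partlabel/kernel-dtb"). Seeking to the end is used instead of os.path.getsize because a block device node does not report its capacity as a file size.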

@phdm,

Let’s clear up a couple of your questions first.

Is the dtbfile configured on the FDT clause in /boot/extlinux/extlinux.conf really loaded or not ?

Yes, U-Boot always honors the DTB file listed via FDT. You can see that it was loaded at the beginning of your UART log dump in your initial post:
374522 bytes read in 31 ms (11.5 MiB/s)
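
One way to cross-check this is to compare the byte count printed by U-Boot with the size of the file named on the FDT line. The sketch below parses both pieces of text. The extlinux.conf sample is illustrative rather than copied from a real system (only the /boot/dtbfile path comes from this thread), and the function names are hypothetical.

```python
import re


def fdt_entries(extlinux_text):
    """Return {label: fdt_path} for each LABEL block that has an FDT line."""
    entries = {}
    label = None
    for raw in extlinux_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].upper()
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "LABEL":
            label = value
        elif keyword == "FDT" and label is not None:
            entries[label] = value
    return entries


def bytes_read(uboot_log):
    """Byte counts from U-Boot 'N bytes read in ...' lines, in log order."""
    return [int(n) for n in re.findall(r"^(\d+) bytes read in", uboot_log, re.M)]


# Illustrative configuration; the layout is an assumption, not a dump from the devkit.
SAMPLE_EXTLINUX = """\
TIMEOUT 30
DEFAULT primary

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      FDT /boot/dtbfile
      INITRD /boot/initrd
"""

SAMPLE_LOG = """\
374522 bytes read in 31 ms (11.5 MiB/s)
## Flattened Device Tree blob at 88400000
"""

if __name__ == "__main__":
    print(fdt_entries(SAMPLE_EXTLINUX))
    print(bytes_read(SAMPLE_LOG))
```

If os.path.getsize() of the FDT path equals one of the counts returned by bytes_read(), that is a reasonable hint that the file was the one loaded, although a same-size file would pass the check too.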

I get strange messages coming from uboot before starting linux itself :

The 3 ‘ERROR: reserving fdt memory region failed’ messages from U-Boot are due to 3 fbX_carveout nodes in the DTB that are empty (0 address, 0 size). This is due to the fact that CBoot did not populate those nodes because it didn’t find a display attached/active for them. Typically only fb0_carveout is populated (HDMI), so you will always see these messages from U-Boot and can ignore them (you should see a similar message from the kernel).
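
To see which carveouts are empty from the running system, the raw reg property of each node can be decoded. This is a minimal sketch that assumes two 32-bit address cells and two size cells for /reserved-memory; that should be verified against the #address-cells and #size-cells properties of the actual tree, and the sample values in the usage lines are made up.

```python
import struct


def decode_reg(raw, address_cells=2, size_cells=2):
    """Decode a raw 'reg' property (big-endian 32-bit cells) into (address, size) pairs."""

    def cells_to_int(cells):
        value = 0
        for cell in cells:
            value = (value << 32) | cell
        return value

    n_cells = len(raw) // 4
    cells = struct.unpack(">%dI" % n_cells, raw[: n_cells * 4])
    step = address_cells + size_cells
    regions = []
    for i in range(0, len(cells) - step + 1, step):
        address = cells_to_int(cells[i : i + address_cells])
        size = cells_to_int(cells[i + address_cells : i + step])
        regions.append((address, size))
    return regions


def is_empty_region(region):
    """True for the addr=0 size=0 case that U-Boot complains about."""
    address, size = region
    return address == 0 and size == 0


if __name__ == "__main__":
    # Hypothetical populated carveout followed by an empty one.
    populated = struct.pack(">4I", 0, 0x96000000, 0, 0x00800000)
    empty = bytes(16)
    for raw in (populated, empty):
        for region in decode_reg(raw):
            print(hex(region[0]), hex(region[1]), is_empty_region(region))
```

On the target, the bytes could be read from a path such as /proc/device-tree/reserved-memory/fb0_carveout/reg, assuming the node carries a reg property.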

Now on to phandle duplication/corruption in the DTB. When U-Boot is asked to load a DTB via FDT in extlinux.conf (a disk-based kernel DTB), it has to merge the internal DTB nodes/properties (from the RAM-based DTB) with this disk-based DTB before it can hand it to the kernel on boot. This feature is called DTB merge. It’s necessary because nvtboot, CBoot, and even U-Boot may have done HW probe to detect carveouts, plug-in boards, etc. that then need to be updated in the current DTB for kernel use. A disk-based DTB has none of this info, since it’s static.

When the disk-based DTB matches the ‘flashed’ DTB (that eventually becomes the RAM-based DTB), U-Boot has no issue with merging the HW info, since the phandles match. But when the disk-based DTB has been changed by an end-user/customer, depending on the extent of the changes, the disk-based DTB can have re-ordered phandles (as part of the recompilation that DTC does), which might now conflict with the phandles in the RAM-based DTB. I’ve recently made a change to L4T U-Boot to attempt to detect this and skip the merge/update of the external-memory-controller node (or any source node that uses phandles), and emit a warning during boot. But that change won’t be available until we release 32.6, which is due soon IIUC.
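
The clash can be illustrated with a toy model in which a tree is just a mapping from node path to phandle value. This is not how U-Boot implements the merge; it only shows why copying a node, phandle included, from one compiled tree into another can yield duplicates. The values 0x85 and 0x6d appear in the dtc output earlier in the thread, but which tree each belongs to is an assumption made for the example, and 0x90 is invented.

```python
def merge_nodes(disk_tree, ram_tree, paths_to_copy):
    """Copy selected nodes (phandle included) from the RAM-based tree onto the disk-based one."""
    merged = dict(disk_tree)
    for path in paths_to_copy:
        merged[path] = ram_tree[path]
    return merged


def duplicated_phandles(tree):
    """Return {phandle: [paths]} for every phandle used by more than one node."""
    first_seen = {}
    duplicates = {}
    for path, phandle in tree.items():
        if phandle is None:
            continue
        if phandle in first_seen:
            duplicates.setdefault(phandle, [first_seen[phandle]]).append(path)
        else:
            first_seen[phandle] = path
    return duplicates


# RAM-based tree (from the flashed DTB): the carveout was given 0x85 at compile time.
RAM_TREE = {
    "/reserved-memory/fb0_carveout": 0x85,
    "/iommu@12000000/address-space-prop/host1x": 0x90,
}

# Disk-based tree (recompiled with changes): dtc handed out the numbers differently.
DISK_TREE = {
    "/reserved-memory/fb0_carveout": 0x6D,
    "/iommu@12000000/address-space-prop/host1x": 0x85,
}

if __name__ == "__main__":
    merged = merge_nodes(DISK_TREE, RAM_TREE, ["/reserved-memory/fb0_carveout"])
    print(duplicated_phandles(merged))
```

When both trees come from the same build, the copied node carries the phandle the disk-based tree already expects, so duplicated_phandles() returns an empty mapping.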

So in your case, it appears you are seeing duplicate phandles in the final DTB that was given to the kernel by U-Boot. This can cause the kernel driver that uses those nodes (iommu, xusb-padctl in your case) to spew warnings, abort and even hang the system. The fix for that (for now) is to use your ‘new’ DTB with your changes as the original, flashed DTB, so that nvtboot/CBoot/U-Boot all get the changed nodes/phandles from the start. Doing that means you do NOT need to also list it in extlinux.conf FDT. It’s less convenient, requiring you to either sign & ‘dd’ your new DTB from the kernel command line, or to reflash it via ‘flash.sh -k DTB “your-board” mmcblk0p1’, which will handle signing/reflashing just the DTB to your TX2’s flash (eMMC).

Another fix coming in R32.6 U-Boot is the addition of DTBO (overlay) support, which should mean you can add your DTB changes to an overlay DTB that gets loaded via extlinux.conf (label FDTOVERLAYS) by U-Boot, and any phandle issues are handled by the overlay support in the FDT lib. We’ve tested this with the same overlays (audio, CAN, SPI, etc.) that are available in the Jetson.IO package. It may be useful for your changes.

HTH,

Tom

Hello Tom,

thank you for your detailed explanation. It raises other questions for me, but the easiest and most urgent is : Can I get early access to jetpack-4.6 ?

Philippe

hello phdm,

could you please try updating the u-boot binary with the attachment, Topic180197_Jun17_u-boot.zip (268.2 KB)
you should perform a partition update and flash the kernel partition to update the uboot binary.
please also check the bootloader logs to confirm the uboot version.
thanks

Thank you.

Can you provide me an already signed version of it, or a way to sign it locally on my TX2, so
that I can do simply

sudo dd if=u-boot.signed of=/dev/disk/by-partlabel/kernel

instead of going into recovery mode, connecting a USB cable, powering up an ubuntu PC and figuring out which commands I must type to simply overwrite that partition ?

Hello!

I am facing the same issue. As JerryChang suggested, I replaced L4T/bootloader/t186ref/p3636-0001/u-boot.bin with the downloaded u-boot.bin.

I put the TX2-NX into recovery mode and connected it to the host PC. After that, I ran this command in L4T to rewrite u-boot.bin:

sudo ./flash.sh -k LNX jetson-xavier-nx-devkit-tx2-nx mmcblk0p1

It gives this error at the end:

[ 12.4830 ] tegradevflash_v2 --write DTB /home/mert/nvidia/nvidia_sdk/JetPack_4.5.1_Linux_JETSON_TX2_NX/Linux_for_Tegra/bootloader/kernel_tegra186-p3636-0001-p3509-0000-a01.dtb
[ 12.4857 ] Bootloader version 01.00.0000
[ 12.8532 ] Writing partition DTB with /home/mert/nvidia/nvidia_sdk/JetPack_4.5.1_Linux_JETSON_TX2_NX/Linux_for_Tegra/bootloader/kernel_tegra186-p3636-0001-p3509-0000-a01.dtb
[ 12.8539 ] 000000000d0d000d: o open partition %s.
[ 12.8659 ]
[ 12.8659 ]
Error: Return value 13
Command tegradevflash_v2 --write DTB /home/mert/nvidia/nvidia_sdk/JetPack_4.5.1_Linux_JETSON_TX2_NX/Linux_for_Tegra/bootloader/kernel_tegra186-p3636-0001-p3509-0000-a01.dtb
Failed to flash/read t186ref.

This is the full output:
full-output.txt (21.0 KB)

You might be interested in this:
https://forums.developer.nvidia.com/t/can-runtime-update-sdmmc-boot-partition-after-enabling-fuse-by-rcm-boot-nfs-or-ota-upgrade/82334/2

hello phdm,

you should include the --no-flash option to generate a signed version locally.
for example,

$ sudo ./flash.sh --no-flash -r -k kernel jetson-xavier-nx-devkit-tx2-nx mmcblk0p1
...
[   0.0365 ] Signed file: $OUT/Linux_for_Tegra/bootloader/boot_sigheader.img.encrypt

after that,
you should copy the file to your target and use the dd command to overwrite the kernel partition.
for example,

$ ls -al /dev/disk/by-partlabel
...
lrwxrwxrwx 1 root root  16 Jun 16 03:13 kernel -> ../../mmcblk0p28

hello mozturk,

this command line is incorrect; please specify the partition as kernel to flash it.

Hello JerryChang and TWarren,

Quick answer : I have tried the provided new u-boot binary, and it does not solve the bugs in loading the FDT-specified dtb.

Details :
on the ubuntu PC :

cd ~/nvidia/nvidia_sdk/
cd JetPack_4.5.1_Linux_JETSON_TX2_TARGETS/Linux_for_Tegra/
mv bootloader/t186ref/p2771-0000/500/u-boot.bin bootloader/t186ref/p2771-0000/500/u-boot.binbk
mv ~/Topic180197_Jun17_u-boot.bin bootloader/t186ref/p2771-0000/500/u-boot.bin
./flash.sh --no-flash -r -k kernel jetson-tx2 mmcblk0p1
scp -p ./bootloader/temp_user_dir/boot_sigheader.img.encrypt nvidia@devkit:

On the devkit :

sudo dd if=boot_sigheader.img.encrypt of=/dev/disk/by-partlabel/kernel

At reboot :

U-Boot 2020.04-g1290f21 (Jun 17 2021 - 15:56:34 +0800)

SoC: tegra186
Model: NVIDIA P2771-0000-500
Board: NVIDIA P2771-0000
DRAM:  7.8 GiB
MMC:   sdhci@3400000: 1, sdhci@3460000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment
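
Confirming the U-Boot version from a captured serial log can be scripted. This is a minimal sketch, assuming the banner always has the shape shown above; the function name and the regular expression are illustrative.

```python
import re

# Matches e.g. "U-Boot 2020.04-g1290f21 (Jun 17 2021 - 15:56:34 +0800)"
BANNER_RE = re.compile(
    r"^U-Boot (?P<version>\S+) \((?P<built>[A-Z][a-z]{2} +\d{1,2} \d{4}) - "
)


def parse_banner(log):
    """Return (version, build_date) from the first U-Boot banner line, or None."""
    for line in log.splitlines():
        m = BANNER_RE.match(line.strip())
        if m:
            return m.group("version"), m.group("built")
    return None


SAMPLE_BOOT_LOG = """\
U-Boot 2020.04-g1290f21 (Jun 17 2021 - 15:56:34 +0800)

SoC: tegra186
Model: NVIDIA P2771-0000-500
"""

if __name__ == "__main__":
    print(parse_banner(SAMPLE_BOOT_LOG))
```

Comparing the returned build date with the date in the attachment name (Jun17) is a quick way to confirm that the new binary is the one running.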

Checking the loaded DT :

nvidia@devkit-jp451:~$ dtc -I fs /proc/device-tree
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x_client has duplicated phandle 0x86 (seen before at /reserved-memory/fb1_carveout)
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/host1x has duplicated phandle 0x85 (seen before at /reserved-memory/fb0_carveout)
<stdout>: ERROR (explicit_phandles): /iommu@12000000/address-space-prop/common has duplicated phandle 0x87 (seen before at /reserved-memory/fb2_carveout)
<stdout>: ERROR (explicit_phandles): /xusb_padctl@3520000/pads/usb3/lanes/usb3-0 has duplicated phandle 0x9b (seen before at /reserved-memory/vpr-carveout)
ERROR: Input tree has errors, aborting (use -f to force output)
nvidia@devkit-jp451:~$