Hello,
I have flashed a Jetson TX2 on a devkit with JetPack 4.5.1 using SDK Manager. The vanilla installation seems to work. I also have a custom daughter-board that plugs into the devkit, and I have created a device tree (DT) describing both the devkit and the daughter-board.
I used to do exactly the same with JetPack 4.3, and that worked perfectly, but now with JetPack 4.5.1 Linux fails in the early boot phase:
Retrieving file: /boot/dtbfile
260918 bytes read in 33 ms (7.5 MiB/s)
## Flattened Device Tree blob at 88400000
Booting using the fdt blob at 0x88400000
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
ERROR: reserving fdt memory region failed (addr=0 size=0)
Using Device Tree in place at 0000000088400000, end 0000000088442b35
copying carveout for /host1x@13e00000/display-hub@15200000/display@15200000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15210000...
copying carveout for /host1x@13e00000/display-hub@15200000/display@15220000...
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x100
[ 0.000000] Linux version 4.9.201-jp451-0.macq~.jp451.fix.boot.crash-gc636435 (jenkinsbld@docker-macq-build-ubuntu18.04-64) (gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) ) #1 SMP PREEMPT Tue Jun 8 08:40:16 CEST 2021
[ 0.000000] Boot CPU: AArch64 Processor [411fd073]
[ 0.000000] OF: fdt:memory scan node memory@80000000, reg size 80,
[ 0.000000] OF: fdt: - 80000000 , 70000000
[ 0.000000] OF: fdt: - f0200000 , 185600000
[ 0.000000] OF: fdt: - 275e00000 , 200000
[ 0.000000] OF: fdt: - 276600000 , 200000
[ 0.000000] OF: fdt: - 277000000 , 200000
[ 0.000000] earlycon: uart8250 at MMIO32 0x0000000003100000 (options '')
[ 0.000000] bootconsole [uart8250] enabled
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb2_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb2_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb1_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb1_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb0_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: fdt:Reserved memory: failed to reserve memory for node 'fb0_carveout': base 0x0000000000000000, size 0 MiB
[ 0.000000] OF: reserved mem: initialized node vpr-carveout, compatible id nvidia,vpr-carveout
[ 0.000000] OF: reserved mem: initialized node ramoops_carveout, compatible id nvidia,ramoops
[ 0.000000] cma: Reserved 64 MiB at 0x00000000fc000000
[ 0.000000] psci: probing for conduit method from DT.
[ 0.000000] psci: PSCIv1.0 detected in firmware.
[ 0.000000] psci: Using standard PSCI v0.2 function IDs
[ 0.000000] psci: MIGRATE_INFO_TYPE not supported.
[ 0.000000] psci: SMC Calling Convention v1.1
[ 0.000000] percpu: Embedded 24 pages/cpu s58200 r8192 d31912 u98304
[ 0.000000] Speculative Store Bypass Disable mitigation not required
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 2022968
[ 0.000000] Kernel command line: console=ttyS0,115200 androidboot.presilicon=true firmware_class.path=/etc/firmware root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 isolcpus=1-2 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x3100000 nvdumper_reserved=0x2772e0000 gpt rootfs.slot_suffix= usbcore.old_scheme_first=1 tegraid=18.1.2.0.0 maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2031647.1 vpr_resize bl_prof_dataptr=0x10000@0x275840000 sdhci_tegra.en_boot_part_access=1
[ 0.000000] log_buf_len individual max cpu contribution: 32768 bytes
[ 0.000000] log_buf_len total cpu_extra contributions: 163840 bytes
[ 0.000000] log_buf_len min size: 262144 bytes
[ 0.000000] log_buf_len: 524288 bytes
[ 0.000000] early log buf free: 258616(98%)
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[ 0.000000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 0.000000] Memory: 7292184K/8220672K available (15166K kernel code, 2692K rwdata, 5936K rodata, 2688K init, 851K bss, 174824K reserved, 753664K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffffff8000000000 - 0xffffff8008000000 ( 128 MB)
[ 0.000000] vmalloc : 0xffffff8008000000 - 0xffffffbebfff0000 ( 250 GB)
[ 0.000000] .text : 0xffffff8008080000 - 0xffffff8008f50000 ( 15168 KB)
[ 0.000000] .rodata : 0xffffff8008f50000 - 0xffffff8009520000 ( 5952 KB)
[ 0.000000] .init : 0xffffff8009520000 - 0xffffff80097c0000 ( 2688 KB)
[ 0.000000] .data : 0xffffff80097c0000 - 0xffffff8009a61008 ( 2693 KB)
[ 0.000000] .bss : 0xffffff8009a61008 - 0xffffff8009b35f4c ( 852 KB)
[ 0.000000] fixed : 0xffffffbefe7fd000 - 0xffffffbefec00000 ( 4108 KB)
[ 0.000000] PCI I/O : 0xffffffbefee00000 - 0xffffffbeffe00000 ( 16 MB)
[ 0.000000] vmemmap : 0xffffffbf00000000 - 0xffffffc000000000 ( 4 GB maximum)
[ 0.000000] 0xffffffbf00000000 - 0xffffffbf07dc8000 ( 125 MB actual)
[ 0.000000] memory : 0xffffffc000000000 - 0xffffffc1f7200000 ( 8050 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
[ 0.000000] Preemptible hierarchical RCU implementation.
[ 0.000000] Build-time adjustment of leaf fanout to 64.
[ 0.000000] RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=6.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=6
[ 0.000000] NR_IRQS:64 nr_irqs:64 0
[ 0.000000] GIC: Using split EOI/Deactivate mode
[ 0.000000] arm_arch_timer: Architected cp15 timer(s) running at 31.25MHz (phys).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xe6a171046, max_idle_ns: 881590405314 ns
[ 0.000003] sched_clock: 56 bits at 31MHz, resolution 32ns, wraps every 4398046511088ns
[ 0.009414] Console: colour dummy device 80x25
[ 0.014074] console [tty0] enabled
[ 0.017628] bootconsole [uart8250] disabled
And then it reboots :(
After digging through the kernel sources, I finally discovered that the kernel panics on a NULL pointer dereference. The panic happens after the bootconsole is disabled but before the real console has had a chance to print anything, which is why nothing shows up in the log. I tracked it down to a use-after-free of a dynamically allocated struct in drivers/iommu/arm-smmu.c.
More details:
arm_smmu_device_dt_probe allocates a ‘struct arm_smmu_device’ using devm_kzalloc and keeps a reference to it in ‘smmu_handle’. If arm_smmu_device_dt_probe returns an error for any reason, all the memory allocated with the devm_ family of functions is freed, including the struct that ‘smmu_handle’ points to. The free happens at the following point:
[ 5.227613] [<ffffff800822ab68>] kfree+0x2d0/0x2d8
[ 5.232617] [<ffffff800882cee0>] release_nodes+0x138/0x208
[ 5.238338] [<ffffff800882d454>] devres_release_all+0x3c/0x60
[ 5.244336] [<ffffff8008827ee0>] driver_probe_device+0x2b0/0x450
[ 5.250607] [<ffffff8008828250>] __device_attach_driver+0xa8/0x148
[ 5.257059] [<ffffff8008825888>] bus_for_each_drv+0x58/0xa8
[ 5.262878] [<ffffff8008827a8c>] __device_attach+0xbc/0x138
[ 5.268704] [<ffffff800882836c>] device_initial_probe+0x24/0x30
[ 5.274887] [<ffffff8008826bb4>] bus_probe_device+0x9c/0xa8
[ 5.280705] [<ffffff8008824218>] device_add+0x3d0/0x5d8
[ 5.286161] [<ffffff8008c448d0>] of_device_add+0x40/0x50
[ 5.291712] [<ffffff8008c450ac>] of_platform_device_create_pdata+0x9c/0x100
[ 5.298989] [<ffffff8008c45148>] of_platform_device_create+0x38/0x48
[ 5.305629] [<ffffff800956ebb8>] arm_smmu_of_setup+0xdc/0x118
[ 5.311625] [<ffffff800956e7b0>] of_iommu_init+0x48/0x90
[ 5.317172] [<ffffff8008083bfc>] do_one_initcall+0x104/0x148
[ 5.323086] [<ffffff8009530d10>] kernel_init_freeable+0x1bc/0x25c
[ 5.329456] [<ffffff8008f3e4a0>] kernel_init+0x18/0x108
[ 5.334918] [<ffffff80080838a0>] ret_from_fork+0x10/0x30
The freed memory is later reallocated to some other kernel driver, and when smmu_handle is used again, this is the result:
[ 3.271180] Call trace:
[ 3.273735] [<ffffff800870e42c>] arm_smmu_add_device+0x124/0x5e0
[ 3.280006] [<ffffff8008706170>] iommu_bus_notifier+0xe8/0x138
[ 3.286095] [<ffffff80080dc66c>] notifier_call_chain+0x5c/0xa0
[ 3.292184] [<ffffff80080dd12c>] blocking_notifier_call_chain+0x64/0x88
[ 3.299092] [<ffffff8008823f8c>] device_add+0x3bc/0x5d8
[ 3.304549] [<ffffff8008c44568>] of_device_add+0x40/0x50
[ 3.310095] [<ffffff8008c44d44>] of_platform_device_create_pdata+0x9c/0x100
[ 3.317363] [<ffffff8008c45034>] of_platform_bus_create+0x104/0x468
[ 3.323908] [<ffffff8008c4559c>] of_platform_populate+0x8c/0x140
[ 3.330181] [<ffffff80095712bc>] of_platform_default_populate_init+0x68/0x7c
[ 3.337541] [<ffffff8008083bf0>] do_one_initcall+0xf8/0x130
[ 3.343358] [<ffffff8009520d10>] kernel_init_freeable+0x1bc/0x25c
[ 3.349723] [<ffffff8008f3df70>] kernel_init+0x18/0x108
[ 3.355178] [<ffffff80080838a0>] ret_from_fork+0x10/0x30
[ 3.360738] ---[ end trace 795dde86e029b986 ]---
[ 3.369631] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 3.369631]
[ 3.379187] SMP: stopping secondary CPUs
[ 3.387232] Rebooting in 5 seconds..
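The sequence described above (managed allocation, global handle published, probe fails, managed memory released, handle left dangling) can be sketched as a small userspace model. This is only an illustration: every name below (`fake_smmu`, `managed_zalloc`, `fake_smmu_handle`, and so on) is hypothetical and stands in for the kernel counterpart mentioned in its comment. A fixed pool with an `in_use` flag replaces real freeing so that the model itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: none of these names are kernel APIs. */
#define POOL_SLOTS 4

struct fake_smmu { int probed; };

struct slot {
    bool in_use;
    struct fake_smmu obj;
};

static struct slot pool[POOL_SLOTS];

/* Global handle, standing in for smmu_handle in arm-smmu.c. */
static struct fake_smmu *fake_smmu_handle;

/* Models devm_kzalloc(): a zeroed allocation owned by the device. */
static struct fake_smmu *managed_zalloc(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            memset(&pool[i].obj, 0, sizeof(pool[i].obj));
            return &pool[i].obj;
        }
    }
    return NULL;
}

/* Models devres_release_all(): drops every managed allocation. */
static void managed_release_all(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++)
        pool[i].in_use = false;
}

/* Models the probe: publish the handle first, possibly fail afterwards. */
static int fake_probe(bool fail)
{
    struct fake_smmu *smmu = managed_zalloc();

    if (!smmu)
        return -1;
    fake_smmu_handle = smmu;   /* published before success is known */
    if (fail)
        return -1;             /* the handle is not cleared on this path */
    smmu->probed = 1;
    return 0;
}

/* Models driver_probe_device(): release managed memory on failure. */
static int fake_driver_probe_device(bool fail)
{
    int ret = fake_probe(fail);

    if (ret)
        managed_release_all();
    return ret;
}

/* True when the global handle points at storage that was released. */
static bool handle_is_dangling(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++)
        if (fake_smmu_handle == &pool[i].obj)
            return !pool[i].in_use;
    return false;
}

/* Usage: a failed probe leaves a non-NULL handle to released storage. */
static void demo_dangling_handle(void)
{
    int ret = fake_driver_probe_device(true);

    assert(ret != 0);
    assert(fake_smmu_handle != NULL);
    assert(handle_is_dangling());
}
```

In the model, a later `managed_zalloc()` call would hand the same slot to a different caller, which corresponds to the freed struct being reused by another driver before `arm_smmu_add_device` reads through the stale handle.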
Here is a possible patch to avoid the silent panic:
index 35735329bde4..3fdf5baca54c 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2771,7 +2771,29 @@ arm_smmu_device_dt_probe(struct platform_device *pdev)
 	if (tegra_platform_is_unit_fpga())
 		return -ENODEV;
-	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
+	smmu = kzalloc(sizeof(*smmu), GFP_KERNEL);
 	if (!smmu) {
 		dev_err(dev, "failed to allocate arm_smmu_device\n");
 		return -ENOMEM;
I don’t know whether the SMMU driver really works properly after this change, but at least the kernel no longer crashes and we get some informative messages.
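Why swapping devm_kzalloc for kzalloc avoids the crash can be illustrated with another small userspace model. It is a sketch under assumptions, not the driver's code: the `owner` enum, `fake_alloc`, `failing_probe` and the other names are hypothetical, and "releasing" a slot is modelled by resetting its owner instead of calling a real free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these names are hypothetical, not kernel APIs. */
enum owner { OWNER_NONE, OWNER_DEVICE, OWNER_CALLER };

struct fake_smmu {
    enum owner owner;   /* who is responsible for releasing this slot */
    int probed_ok;
};

static struct fake_smmu slots[2];
static struct fake_smmu *fake_smmu_handle;   /* stands in for smmu_handle */

/* OWNER_DEVICE models devm_kzalloc(), OWNER_CALLER models plain kzalloc(). */
static struct fake_smmu *fake_alloc(enum owner who)
{
    for (size_t i = 0; i < 2; i++) {
        if (slots[i].owner == OWNER_NONE) {
            slots[i].owner = who;
            slots[i].probed_ok = 0;
            return &slots[i];
        }
    }
    return NULL;
}

/* Models devres cleanup after a failed probe: only device-owned slots go. */
static void release_device_owned(void)
{
    for (size_t i = 0; i < 2; i++)
        if (slots[i].owner == OWNER_DEVICE)
            slots[i].owner = OWNER_NONE;
}

/* A probe that publishes the handle and then fails. */
static int failing_probe(enum owner who)
{
    fake_smmu_handle = fake_alloc(who);
    if (!fake_smmu_handle)
        return -1;
    release_device_owned();   /* what the driver core does on error */
    return -1;
}

/* True while the handle points at storage that is still allocated. */
static bool handle_still_valid(void)
{
    return fake_smmu_handle && fake_smmu_handle->owner != OWNER_NONE;
}

/* Usage: compare the managed and the unmanaged allocation. */
static void demo_unmanaged_alloc(void)
{
    failing_probe(OWNER_DEVICE);
    assert(!handle_still_valid());   /* managed: handle dangles */

    failing_probe(OWNER_CALLER);
    assert(handle_still_valid());    /* unmanaged: handle still backed */
}
```

The trade-off visible in the model is that the caller-owned struct is never released after a failed probe, and later users of the handle see a device whose `probed_ok` is still 0. Whether the real driver copes with such a half-initialized struct is exactly the open question raised above.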