boot hangs in secondary CPU bring up

Hi All,

We are using a Jetson TK1 board for development.

We are using a zImage with the initramfs built into the zImage itself. We hit a problem when the size of the zImage with initramfs grows beyond roughly 7 MB: kernel boot hangs when it tries to power on the other CPUs. If we add maxcpus=1 to bootargs the kernel boots fine, but we get only one CPU.
Also, if we reduce the zImage size, the kernel boots fine with multiple CPUs. After debugging we found that the issue happens when kernel decompression starts at address 0x80008000 and ends beyond address 0x81000000.

So it seems that our bigger zImage, during decompression, overwrites some data at address 0x81000000, and this causes boot to hang. We load the kernel at 0x82000000 and the dtb at 0x83000000 from U-Boot.

Is there anything hardcoded in the kernel for address 0x81000000?

Thanks in advance.

The physical memory space where the kernel is initially loaded sits just above the space reserved for modules and the initial ramdisk. The total size reserved for both spaces together is 32MB. A typical failure occurs when 16MB is reserved for modules and 16MB for the initrd (32MB total), but the modules are larger than 16MB. You won’t usually get an outright failure; instead, some module will contain a spinlock or similar piece of code that hits a branch-instruction-out-of-range error (the initrd is placed between the kernel’s lower address and the modules, so part of one or more modules can be out of range because of the initrd size). Unless the right options are on, you may not see this logged.

In your case it sounds like a critical block of code from a module is unreachable due to the 32MB branch instruction limitation.

If you need more module space (taking from initrd space) you’ll find the only option is to allocate a slightly different amount of space between modules and initrd, where the total space is always 32MB. The default kernel setting is via “CONFIG_TASK_SIZE_3G_LESS_24M” (versus the mutually exclusive “CONFIG_TASK_SIZE_3G_LESS_16M”). The default takes 32MB and reserves 24M for initrd, leaving modules with 8MB (32MB - 24MB); the mutually exclusive inverse option reserves 16M for initrd, thus allowing up to 16MB (32MB - 16MB) for modules.

Sorry, I don’t remember which menuconfig path gets to that option. Unless your initrd actually requires more than 16MB, this is a fairly easy fix. If you don’t see the config menu option which deals with this, I can probably find it. Hand editing the .config file isn’t advised, since there may be other configuration changes when switching between those two options. If you need more than 32MB total, you’re out of luck: the limitation is in the assembler branch instruction, and getting larger spaces to work would require massive rewrites. A good workaround is to build features that were previously modules directly into the kernel (thus reducing the total module space required).
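To make the 32MB figure concrete, here is a minimal sketch of the reach of a direct ARM-mode (A32) B/BL instruction, which encodes a signed 24-bit word offset relative to the instruction address plus 8. The helper name `branch_reachable` and the sample addresses are only illustrative; a Thumb-2 kernel has a different (smaller) branch range, and nothing here shows that a branch range problem is actually what causes the hang in this thread.

```python
# Sketch: reach of an ARM-mode (A32) B/BL instruction.
# The instruction holds a signed 24-bit word offset, i.e. a byte offset in
# [-2**25, 2**25 - 4], measured from the instruction address + 8.

BRANCH_MIN = -(1 << 25)       # -32 MiB
BRANCH_MAX = (1 << 25) - 4    # +32 MiB minus one instruction


def branch_reachable(insn_addr: int, target_addr: int) -> bool:
    """Return True if a direct B/BL at insn_addr can reach target_addr."""
    offset = target_addr - (insn_addr + 8)  # PC reads as insn address + 8
    return offset % 4 == 0 and BRANCH_MIN <= offset <= BRANCH_MAX


# Illustrative addresses only (assumed, not a statement about any real module):
MODULE_ADDR = 0xBF000000   # an instruction somewhere in module space
KERNEL_FUNC = 0xC0078000   # a function near the start of kernel text
FAR_TARGET = MODULE_ADDR + (64 << 20)  # 64 MiB away, well out of range

print("module -> kernel text reachable:", branch_reachable(MODULE_ADDR, KERNEL_FUNC))
print("module -> far target reachable: ", branch_reachable(MODULE_ADDR, FAR_TARGET))
```

Running the same check against real symbol addresses (for example from System.map and /proc/modules) would show whether any module-to-kernel call is actually near the limit.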

We enabled CONFIG_TASK_SIZE_3G_LESS_16M, but it did not resolve the issue.

We added our own logging to the decompression routine to find the end address of decompression, and we found that the issue occurs when the decompression end address crosses 0x81000000.

The following is the log we get after enabling low-level logs and early printk:

U-Boot SPL 2014.10-rc2 (Jun 15 2015 - 14:01:30)

U-Boot 2014.10-rc2 (Jun 15 2015 - 14:01:30)

TEGRA124
Board: NVIDIA Jetson TK1
I2C: ready
DRAM: 2 GiB
MMC: Tegra SD/MMC: 0, Tegra SD/MMC: 1
tegra-pcie: PCI regions:
tegra-pcie: I/O: 0x12000000-0x12010000
tegra-pcie: non-prefetchable memory: 0x13000000-0x20000000
tegra-pcie: prefetchable memory: 0x20000000-0x40000000
tegra-pcie: 2x1, 1x1 configuration
W?®¬K\¬ZYprobing port 0, using 2 lanes
tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, ignoring
tegra-pcie: probing port 1, using 1 lanes
In: serial
Out: serial
Err: serial
Net: RTL8169#0
Hit any key to stop autoboot: 0
Tegra124 (Jetson TK1) # boot
MMC: no card present
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0…
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
3014 bytes read in 261 ms (10.7 KiB/s)
Jetson-TK1 eMMC boot options
1: primary kernel
2: zImage my build
3: zImage with initramfs
4: uImage with initramfs
Enter choice: 3
3: zImage with initramfs
Retrieving file: /boot/zImage.initramfs
7427232 bytes read in 518 ms (13.7 MiB/s)
append: noinitrd console=ttyS0,115200n8 maxpus=10 no_console_suspend=1 lp0_vec=2064@0xf46ff000 mem=2015M@2048M memtype=255 ddr_die=2048M@2048M section=256M pmuboard=0x0177:0x0000:0x02:0x43:0x00 tsec=32M@3913M otf_key=c75e5bb91eb3bd947560357b64422f85 usbcore.old_scheme_first=1 core_edp_mv=1150 core_edp_ma=4000 tegraid=40.1.1.0.0 debug_uartport=lsport,3 power_supply=Adapter audio_codec=rt5640 modem_id=0 android.kerneltype=normal fbcon=map:1 commchip_id=0 usb_port_owner_info=0 lane_owner_info=6 emc_max_dvfs=0 touch_id=0@0 board_info=0x0177:0x0000:0x02:0x43:0x00 root=/dev/mmcblk0p1 rw rootwait tegraboot=sdmmc gpt
Retrieving file: /boot/tegra_new.dtb
56779 bytes read in 327 ms (168.9 KiB/s)
Kernel image @ 0x82000000 [ 0x000000 - 0x7154a0 ]

Flattened Device Tree blob at 83000000

Booting using the fdt blob at 0x83000000
Using Device Tree in place at 83000000, end 83010dca

Starting kernel …

free_mem_ptr Print int=827164C0 done
free_mem_end_ptr Print int=827264C0 done
input data Print int=82004466 done
input data len Print int=00710FFC done
output data Print int=80078000 done
Uncompressing Linux…

out_len= Print int=7FF87FFF done
decompress done outbuf= Print int=810003B4 done
done, booting the kernel.
Booting Linux on physical CPU 0x0
Initializing cgroup subsys cpu
Initializing cgroup subsys cpuacct
Linux version 3.10.40 (einfochips@AHMCPU1073) (gcc version 4.9.1 (GCC) ) #2 SMP PREEMPT Tue Jul 7 13:10:37 IST 2015
CPU: ARMv7 Processor [413fc0f3] revision 3 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
Machine: jetson-tk1, model: NVIDIA Tegra124 PM375, serial: 0
size=0x80000000 aligned_start=0x80000000
Truncating memory at 0x80000000 to fit in 32-bit physical address space
early_mem:622 start=0x80000000 size=0x7df00000
size=0x80000000 aligned_start=0x80000000
Found tsec, start=0xf4900000 size=2000000Tegra reserved memory:
LP0: f46ff000 - f46ff80f
Bootloader framebuffer: 00000000 - 00000000
Bootloader framebuffer2: 00000000 - 00000000
Framebuffer: f8500000 - f96fffff
2nd Framebuffer: f9700000 - fdefffff
Carveout: 00000000 - 00000000
Vpr: 00000000 - 00000000
Tsec: f4900000 - f68fffff
cma: CMA: reserved 16 MiB at ae800000
Memory policy: ECC disabled, Data cache writealloc
On node 0 totalpages: 492800
free_area_init_node: node 0, pgdat c0fe82c0, node_mem_map c1279000
Normal zone: 1520 pages used for memmap
Normal zone: 0 pages reserved
Normal zone: 194560 pages, LIFO batch:31
HighMem zone: 2510 pages used for memmap
HighMem zone: 298240 pages, LIFO batch:31
tegra_get_chipid =40
cpuid =0c0f
available_cpus =4
DTS File Name: /home/einfochips/tegra/MEP_TEGRA/src-mep/kernel/linux-3.10.40/arch/arm/boot/dts/jetson-tegra.dts
Tegra12: CPU Speedo value 2297, Soc Speedo value 2256, Gpu Speedo value 2130
Tegra12: CPU Speedo ID 1, Soc Speedo ID 1, Gpu Speedo ID 1
Tegra12: CPU Process ID 1,Soc Process ID 1,Gpu Process ID 1
Tegra Revision: A01 SKU: 0x81 CPU Process: 1 Core Process: 1
tegra: PLLP fixed rate: 408000000
tegra_clk_shared_bus_user_init: c2bus client se left ON
tegra_clk_shared_bus_user_init: c4bus client vi left ON
Lowering cpu_lp maximum rate from 1350000000 to 1092000000
Lowering sbus maximum rate from 420000000 to 384000000
Lowering vic03 maximum rate from 900000000 to 828000000
Lowering tsec maximum rate from 900000000 to 828000000
Lowering msenc maximum rate from 600000000 to 528000000
Lowering se maximum rate from 600000000 to 528000000
Lowering vde maximum rate from 600000000 to 528000000
Lowering host1x maximum rate from 500000000 to 444000000
Lowering vi maximum rate from 700000000 to 600000000
Lowering isp maximum rate from 700000000 to 600000000
Lowering c4bus maximum rate from 700000000 to 600000000
Lowering pll_c maximum rate from 1400000000 to 1066000000
Lowering pll_c2 maximum rate from 1200000000 to 1066000000
Lowering pll_c3 maximum rate from 1200000000 to 1066000000
Lowering hdmi maximum rate from 594000000 to 297000000
Lowering sdmmc1 maximum rate from 208000000 to 204000000
Lowering sdmmc3 maximum rate from 208000000 to 204000000
Lowering gbus maximum rate from 1032000000 to 852000000
Lowering cpu_g maximum rate from 3000000000 to 2320500000
tegra dvfs: VDD_CPU nominal 1260mV, scaling enabled
tegra dvfs: VDD_CORE nominal 1150mV, scaling enabled
tegra dvfs: VDD_GPU nominal 1200mV, scaling enabled
Switching to timer-based delay loop
tegra_powergate_init: DONE
tegra12_plle_clk_enable: pll_e is already enabled
PERCPU: Embedded 9 pages/cpu @c2262000 s15360 r8192 d13312 u36864
pcpu-alloc: s15360 r8192 d13312 u36864 alloc=9*4096
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 491280
Kernel command line: noinitrd console=ttyS0,115200n8 maxpus=10 no_console_suspend=1 lp0_vec=2064@0xf46ff000 mem=2015M@2048M memtype=255 ddr_die=2048M@2048M section=256M pmuboard=0x0177:0x0000:0x02:0x43:0x00 tsec=32M@3913M otf_key=c75e5bb91eb3bd947560357b64422f85 usbcore.old_scheme_first=1 core_edp_mv=1150 core_edp_ma=4000 tegraid=40.1.1.0.0 debug_uartport=lsport,3 power_supply=Adapter audio_codec=rt5640 modem_id=0 android.kerneltype=normal fbcon=map:1 commchip_id=0 usb_port_owner_info=0 lane_owner_info=6 emc_max_dvfs=0 touch_id=0@0 board_info=0x0177:0x0000:0x02:0x43:0x00 root=/dev/mmcblk0p1 rw rootwait tegraboot=sdmmc gpt
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1925MB = 1925MB total
Memory: 1918948k/1918948k available, 144412k reserved, 1192956K highmem
Virtual kernel memory layout:
vector : 0xffff0000 - 0xffff1000 ( 4 kB)
fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
vmalloc : 0xf0000000 - 0xff000000 ( 240 MB)
lowmem : 0xc0000000 - 0xef800000 ( 760 MB)
pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)
modules : 0xbf000000 - 0xbfe00000 ( 14 MB)
.text : 0xc0078000 - 0xc0ab89a4 (10499 kB)
.init : 0xc0ab9000 - 0xc0eb0c00 (4063 kB)
.data : 0xc0eb2000 - 0xc10003b4 (1337 kB)
.bss : 0xc10003b4 - 0xc108b14c ( 556 kB)
Preemptible hierarchical RCU implementation.
NR_IRQS:960
the number of interrupt controllers found is 5Architected local timer running at 12.00MHz (phys).
sched_clock: 56 bits at 12MHz, resolution 83ns, wraps every 2863311536128ns
Ignoring duplicate/late registration of read_current_timer delay
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency… lpj=12000
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
Initializing cgroup subsys debug
Initializing cgroup subsys freezer
CPU: Testing write buffer coherency: ok
ftrace: allocating 30631 entries in 60 pages
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
tegra_smp_prepare_cpus:419 max_cpu=4
Setting up static identity map for 0xc07ac888 - 0xc07ac934
ftrace: Allocated trace_printk buffers
Bring up 0 cpu max_cpu=4
Bring up 1 cpu max_cpu=4
before cpu_up
before boot_secondary stack=ee283ff8 pgdir=ae0f8000 pg_dir=80074000
tegra_boot_secondary:295 cpu=1
tegra11x_power_up_cpu:265 cpu=1
tegra11x_power_up_cpu:267 cpu=1
tegra11x_power_up_cpu:279 cpu=1 id=9
tegra11x_power_up_cpu:281 cpu=1
tegra11x_power_up_cpu:283 cpu=1
secondary_start_kernel:329
secondary_start_kernel:339
secondary_start_kernel:349
secondary_start_kernel:354
tegra_secondary_init:170 cpu=1
secondary_start_kernel:363
tegra11x_power_up_cpu:285 cpu=1
secondary_start_kernel:366
CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
tegra_boot_secondary:387 status=0
after boot_secondary ret=0
after cpu_up
Bring up 2 cpu max_cpu=4
before cpu_up
before boot_secondary stack=ee285ff8 pgdir=ae0f8000 pg_dir=80074000
tegra_boot_secondary:295 cpu=2
tegra11x_power_up_cpu:265 cpu=2
tegra11x_power_up_cpu:267 cpu=2
tegra11x_power_up_cpu:279 cpu=2 id=10
tegra11x_power_up_cpu:281 cpu=2
tegra11x_power_up_cpu:283 cpu=2
secondary_start_kernel:329
secondary_start_kernel:339
secondary_start_kernel:349
secondary_start_kernel:354
tegra_secondary_init:170 cpu=2
secondary_start_kernel:363
tegra11x_power_up_cpu:285 cpu=2
secondary_start_kernel:366
CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
tegra_boot_secondary:387 status=0
after boot_secondary ret=0
after cpu_up
Bring up 3 cpu max_cpu=4
before cpu_up
before boot_secondary stack=ee287ff8 pgdir=ae0f8000 pg_dir=80074000
tegra_boot_secondary:295 cpu=3
tegra11x_power_up_cpu:265 cpu=3
tegra11x_power_up_cpu:267 cpu=3
tegra11x_power_up_cpu:279 cpu=3 id=11
tegra11x_power_up_cpu:281 cpu=3
tegra11x_power_up_cpu:283 cpu=3
secondary_start_kernel:329
secondary_start_kernel:339
secondary_start_kernel:349
secondary_start_kernel:354
tegra_secondary_init:170 cpu=3
secondary_start_kernel:363
tegra11x_power_up_cpu:285 cpu=3
secondary_start_kernel:366
CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
tegra_boot_secondary:387 status=0
after boot_secondary ret=0
after cpu_up
Brought up 4 CPUs
SMP: Total of 4 processors activated.
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
pinctrl core: initialized pinctrl subsystem
regulator-dummy: no parameters
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
tegra_smmu tegra_smmu: Loaded Tegra IOMMU driver
cpuidle: using governor ladder
cpuidle: using governor menu
ardbeg_camera_auxdata: update camera lookup table.
tegra-gpio 6000d000.gpio: Initialising GPIO state 0: name default
gpiochip_add: registered GPIOs 0 to 255 on device: tegra-gpio
Wake16 for irq=34
Wake58 for irq=81
Wake41 for irq=129
Wake43 for irq=129
Wake40 for irq=53
Wake42 for irq=53
board_info: id:sku:fab:major:minor = 0x0177:0x0000:0x03:0x45:0x00
board_info: id:sku:fab:major:minor = 0x0177:0x0000:0x03:0x45:0x00
disp1 pclk=154700000
disp2 pclk=297000000
Selecting UARTD as the debug console
The debug console clock name is uartd
ardbeg_modem_init: modem_id = 0
Clear bootloader IO dpd settings
Loading jetson TK1 EMC tables.
tegra: pll_m is selected as scalable EMC clock source
Lowering emc maximum rate from 1200000000 to 924000000
tegra: validated EMC DFS table
laguna_edp_init: CPU regulator 15000 mA
laguna_edp_init: GPU regulator 8000 mA
swapper/0 isomgr_init(): iso emc max clk=924000KHzswapper/0 isomgr_init(): max_iso_bw=7392000KBardbeg_touch_init init raydium touch
Raydium - touch platform_id : 8
platform tegradc.0: IOVA linear map 0xf8500000(1200000)
platform tegradc.0: IOVA linear map 0xf9700000(4800000)
platform tegradc.1: IOVA linear map 0xf8500000(1200000)
platform tegradc.1: IOVA linear map 0xf9700000(4800000)
tegra11_soctherem_oc_int_init(): OC interrupts are not enabled
hw-breakpoint: found 5 (+1 reserved) breakpoint and 4 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 8 bytes.
mc-err: Started MC error interface!
bio: create slab at 0
reg-fixed-voltage 0.regulator: Consumer c1 does not have device name
reg-fixed-voltage 0.regulator: Consumer c2 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
vdd-ac-bat: 8400 mV
reg-fixed-voltage 1.regulator: Consumer c1 does not have device name
reg-fixed-voltage 1.regulator: Consumer c2 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
vdd-3v3-aon: 3300 mV
reg-fixed-voltage 8.regulator: Consumer c1 does not have device name
reg-fixed-voltage 8.regulator: Consumer c2 does not have device name
reg-fixed-voltage 8.regulator: Consumer c3 does not have device name
reg-fixed-voltage 8.regulator: Consumer c6 does not have device name
reg-fixed-voltage 8.regulator: Consumer c7 does not have device name
reg-fixed-voltage 8.regulator: Consumer c8 does not have device name
reg-fixed-voltage 8.regulator: Consumer c9 does not have device name
reg-fixed-voltage 8.regulator: Consumer c10 does not have device name
reg-fixed-voltage 8.regulator: Consumer c11 does not have device name
reg-fixed-voltage 8.regulator: Consumer c12 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
reg-3v3-supply: 3300 mV
vgaarb: loaded
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
of_get_named_gpio_flags exited with status 20
of_get_named_gpio_flags exited with status 21
of_get_named_gpio_flags exited with status 157
of_get_named_gpio_flags exited with status 158
of_get_named_gpio_flags exited with status 217
of_get_named_gpio_flags exited with status 218
of_get_named_gpio_flags exited with status 172
of_get_named_gpio_flags exited with status 173
of_get_named_gpio_flags exited with status 206
of_get_named_gpio_flags exited with status 207
as3722 4-0040: AS3722 ID: ID1:ID2:ID3 = 0x0c:0x01:0x15
as3722 4-0040: Final OTP version 1V21
gpiochip_find_base: found new base at 1016
gpiochip_add: registered GPIOs 1016 to 1023 on device: as3722-gpio
GPIO chip as3722-gpio: created GPIO range 0->7 ==> as3722-pinctrl PIN 0->7
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
as3722-regulator as3722-regulator.0: Consumer c4 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
as3722-regulator as3722-regulator.0: Consumer c4 does not have device name
as3722-regulator as3722-regulator.0: Consumer c5 does not have device name
as3722-regulator as3722-regulator.0: Consumer c6 does not have device name
as3722-regulator as3722-regulator.0: Consumer c7 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
as3722-regulator as3722-regulator.0: Consumer c4 does not have device name
as3722-regulator as3722-regulator.0: Consumer c6 does not have device name
as3722-regulator as3722-regulator.0: Consumer c8 does not have device name
as3722-regulator as3722-regulator.0: Consumer c9 does not have device name
as3722-regulator as3722-regulator.0: Consumer c10 does not have device name
as3722-regulator as3722-regulator.0: Consumer c11 does not have device name
as3722-regulator as3722-regulator.0: Consumer c12 does not have device name
as3722-regulator as3722-regulator.0: Consumer c13 does not have device name
as3722-regulator as3722-regulator.0: Consumer c14 does not have device name
as3722-regulator as3722-regulator.0: Consumer c15 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
as3722-regulator as3722-regulator.0: Consumer c4 does not have device name
as3722-regulator as3722-regulator.0: Consumer c5 does not have device name
as3722-regulator as3722-regulator.0: Consumer c6 does not have device name
as3722-regulator as3722-regulator.0: Consumer c7 does not have device name
as3722-regulator as3722-regulator.0: Consumer c8 does not have device name
as3722-regulator as3722-regulator.0: Consumer c9 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
as3722-regulator as3722-regulator.0: Consumer c8 does not have device name
as3722-regulator as3722-regulator.0: Consumer c9 does not have device name
as3722-regulator as3722-regulator.0: Consumer c10 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c1 does not have device name
as3722-regulator as3722-regulator.0: Consumer c2 does not have device name
as3722-regulator as3722-regulator.0: Consumer c3 does not have device name
vdd-cpu: 650 <–> 1300 mV at 1000 mV 3500 mA
vdd-core: 700 <–> 1350 mV at 1000 mV 3500 mA
vddio-ddr: at 1350 mV
as3722-sd3: no parameters
avdd-pll-pex: 1050 mV
vdd-1v8: at 1800 mV
vdd-gpu: applied init 1000000uV constraint
vdd-gpu: 650 <–> 1200 mV at 1000 mV 3500 mA
avdd-pll: at 1050 mV at 300 mA
vdd-cam: 1800 mV at 150 mA
avdd-dsi-csi: at 1200 mV at 150 mA
vdd-rtc: 800 mV at 150 mA
avdd-cam: 2800 mV at 150 mA
vdd-1v2-cam: 1175 mV at 150 mA
vddio-sdmmc-2: 1800 <–> 3300 mV at 150 mA
vdd-1v1-cam: 1275 mV at 150 mA
avdd-spi: 3300 mV at 150 mA
vdd-2v7-cam: 2800 mV at 150 mA
vpp-fuse: 1800 mV at 150 mA
of_get_named_gpio_flags: can’t parse gpios property
of_get_named_gpio_flags: can’t parse gpios property
Linux video capture interface: v2.00
Advanced Linux Sound Architecture Driver Initialized.
ardbeg_wifi_power: 1
of_get_named_gpio_flags exited with status 108
usb0-vbus: 5000 mV
of_get_named_gpio_flags exited with status 109
usb1-usb2-vbus: 5000 mV
of_get_named_gpio_flags exited with status 86
vdd-hdmi: 5000 mV
of_get_named_gpio_flags exited with status 63
avdd-hdmi-pll: 3300 mV
avdd-hdmi-pll: supplied by avdd-pll-pex
reg-fixed-sync-voltage 6.regulator: Consumer c1 does not have device name
of_get_named_gpio_flags exited with status 122
vdd-lcd-bl: 3300 mV
reg-fixed-sync-voltage 7.regulator: Consumer c1 does not have device name
of_get_named_gpio_flags exited with status 58
vdd-lcd-bl-en: 5000 mV
reg-fixed-sync-voltage 9.regulator: Consumer c3 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
reg-5v0-supply: 5000 mV
reg-fixed-sync-voltage a.regulator: Consumer c1 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c2 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c3 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c10 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c11 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c12 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c13 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c14 does not have device name
reg-fixed-sync-voltage a.regulator: Consumer c15 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
reg-1v8-supply: 1800 mV
reg-fixed-sync-voltage b.regulator: Consumer c1 does not have device name
of_get_named_gpio_flags exited with status 138
reg-dcdc-1v2: 1200 mV
reg-fixed-sync-voltage c.regulator: Consumer c7 does not have device name
reg-fixed-sync-voltage c.regulator: Consumer c8 does not have device name
of_get_named_gpio_flags exited with status 1018
as3722-gpio2-supply: 3300 mV
reg-fixed-sync-voltage d.regulator: Consumer c1 does not have device name
of_get_named_gpio_flags exited with status 1020
as3722-gpio4-supply: 3300 mV
of_get_named_gpio_flags exited with status 136
sdmmc-en-supply: 3300 mV
of_get_named_gpio_flags exited with status 138
vdd-cdc-1v2-aud: 1200 mV
reg-fixed-sync-voltage 10.regulator: Consumer c1 does not have device name
reg-fixed-sync-voltage 10.regulator: Consumer c2 does not have device name
reg-fixed-sync-voltage 10.regulator: Consumer c3 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
reg-aon-1v8: 1800 mV
reg-fixed-sync-voltage 11.regulator: Consumer c1 does not have device name
of_get_named_gpio_flags: can’t parse gpios property
reg-aon-1v2: 1200 mV
as3722-adc-extcon as3722-adc-extcon.2: USB-Host is disconnected
tegra: started io power detection dynamic control
tegra: NO_IO_POWER setting 0x0
Switching to clocksource arch_sys_counter
nvmap_heap_init: nvmap_heap_init: created heap block cache
nvmap_page_pool_init: Total MB RAM: 1889
nvmap_page_pool_init: nvmap page pool size: 60448 pages (236 MB)
nvmap_page_pool_init: highmem=25600, pool_size=60448,totalram=483833, freeram=455167, totalhigh=298239, freehigh=272569
iram: dma coherent mem declare 0x40001000,258048
misc nvmap: created heap iram base 0x40001000 size (252KiB)
nvmap:inner cache maint threshold=2097152Wake39 for irq=52
tegra-otg tegra-otg: otg transceiver registered
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP: reno registered
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
PCI: CLS 0 bytes, default 64
tegra-fuse tegra-fuse: Fuse driver initialized succesfully
host1x host1x: initialized
CPU PMU: probing PMU on CPU 1
hw perfevents: enabled with ARMv7 Cortex-A15 PMU driver, 7 counters available
tegra_throttle : init passed
Tegra cpuquiet initialized: disabled
cpu-tegra: init EDP limit: 2320 MHz
thermal thermal_zone0: Registering thermal zone thermal_zone0 for type CPU-therm
thermal thermal_zone1: Registering thermal zone thermal_zone1 for type GPU-therm
thermal thermal_zone2: Registering thermal zone thermal_zone2 for type MEM-therm
thermal thermal_zone3: Registering thermal zone thermal_zone3 for type PLL-therm
bounce pool size: 64 pages
Installing knfsd (copyright © 1996 okir@monad.swb.de).
NTFS driver 2.1.30 [Flags: R/O].
fuse init (API version 7.22)
msgmni has been set to 1449
io scheduler noop registered (default)
of_get_named_gpio_flags: can’t parse gpios property
of_get_named_gpio_flags: can’t parse gpios property
of_get_named_gpio_flags: can’t parse gpios property

There are two issues:

  1. After the call to pmc_writel(reg, PWRGATE_TOGGLE); the next print comes only after around 40 seconds.
    In the log above, that print is ‘tegra11x_power_up_cpu:281 cpu=1’.
  2. The last print is the following:
    of_get_named_gpio_flags: can’t parse gpios property
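As a cross-check on the log above, the "Virtual kernel memory layout" lines can be translated to physical addresses and compared with the decompressor's end address. This is only a sketch: it assumes the usual linear lowmem mapping with PAGE_OFFSET = 0xC0000000 (the start of "lowmem" in the log) and PHYS_OFFSET = 0x80000000 (the start of RAM in the log), and the helper name `virt_to_phys` is made up for illustration.

```python
# Sketch: map the kernel section addresses printed in the boot log to
# physical addresses. Assumes phys = virt - PAGE_OFFSET + PHYS_OFFSET,
# with both offsets inferred from the log rather than read from the config.

PAGE_OFFSET = 0xC0000000   # assumption: start of "lowmem" in the log
PHYS_OFFSET = 0x80000000   # assumption: start of RAM in the log
BOUNDARY = 0x81000000      # address under suspicion in this thread


def virt_to_phys(virt: int) -> int:
    """Translate a lowmem kernel virtual address to a physical address."""
    return virt - PAGE_OFFSET + PHYS_OFFSET


# Section ranges copied from the "Virtual kernel memory layout" lines.
sections = {
    ".text": (0xC0078000, 0xC0AB89A4),
    ".init": (0xC0AB9000, 0xC0EB0C00),
    ".data": (0xC0EB2000, 0xC10003B4),
    ".bss": (0xC10003B4, 0xC108B14C),
}

for name, (start, end) in sections.items():
    p_start, p_end = virt_to_phys(start), virt_to_phys(end)
    crosses = p_start < BOUNDARY < p_end
    print(f"{name:6s} 0x{p_start:08X} - 0x{p_end:08X} crosses_boundary={crosses}")
```

Under these assumptions the end of .data maps to 0x810003B4, the same value the decompressor prints as outbuf, and .bss lies entirely above 0x81000000.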

Note that the kernel image itself is loaded with its starting physical address at 0x81000000. Below physical address 0x81000000 is the initrd, and below that is module space. The total space below the kernel, comprising the initrd plus module space, must not exceed 32MB. Although I had assumed modules were being pushed beyond 32MB by the initrd size, it looks like the initrd is decompressing and overwriting the kernel’s lower physical addresses.

I think my first suggestion actually went the wrong direction, as I was looking for module boundaries, but the issue is the initrd being too large. Modules too far out can’t branch to the kernel, initrd too large overwrites the kernel in the lower address space. Changing allowed initrd from 24MB to 16MB would actually give it less room when it was already overwriting with 24MB.

I do not know of any kernel configuration to give more than 24MB to initrd. Something may exist, but it seems you have no choice but to cut down the initrd size, and probably go back to the …LESS_24M configuration. I’m not sure but it may be possible to squeeze a bit more space out for initrd if you completely remove kernel module support (that’s a pretty big loss).

Thanks for the reply.

Note that we load the compressed kernel at 0x82000000 and the dtb at 0x83000000 from U-Boot.

We have observed that we can boot the same kernel if we use maxcpus=1 in the kernel boot args.

Now the question is: if the decompression routine overwrites some data above address 0x81000000, how come the same kernel works when we pass maxcpus=1 in bootargs?

I have not tried to load a kernel at an alternate address, so I do not know what issues might arise from that. However, does your kernel use modules? The limit on 32MB branch still applies, it’s a limit of the architecture’s branch instruction.

If you moved the kernel to a 16MB higher physical address as mentioned (0x82000000 - 0x81000000 = 0x01000000), I’m not sure how the module and initrd spaces would change, if at all. One possibility is that this “access window” would remain 32MB in size and move up with the kernel regardless of what the boot loader does (I do not know if the kernel is looking at relative address or absolute address where the issue occurs, nor do I know if the kernel cares what U-Boot does), in which case nothing has changed.

Even if the initrd gains some extra space it may still be insufficient, depending on the decompressed image size. If the issue is related to modules exceeding the 32MB branch range at some point, and if they have moved further away from the kernel base, then the modules will hit their branch limits earlier. This is hard to predict without a debugger, because code more than 32MB away works fine if it was entered from closer than the 32MB limit, and if no code further out requires a branch, nothing will happen. I’ve seen the most issues when a module branches because of a spinlock.

How big is the decompressed image? Have you tried reducing the initrd size? Have you tried building modules into the kernel (non-module format) where possible? There simply isn’t a way to say why going to a single CPU changes the issue, other than perhaps that more code loads and runs with multiple CPUs.

Seeing how things change with non-module integrated code (instead of modules) and with reduced initrd size would help. What is in the initrd? Was this particular issue why the kernel base address was moved higher, or was there something else prompting the address move?

In our case the zImage size is 8440568 bytes (including the initramfs inside).

The reason for changing the compressed kernel image load address to 0x82000000 is as follows.
After debugging we found that the decompressed size is 18547636 bytes, so decompression starting at address 0x80008000 ends at 0x811B83B4. Since this would overwrite the compressed zImage, we changed the kernel load address in U-Boot to 0x82000000.
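The arithmetic in this post can be checked with a short sketch like the one below. The sizes and addresses are the ones quoted in the post; the helper names `end_address` and `overlaps` are only illustrative, and the check looks purely at address ranges, ignoring anything the decompressor itself may do.

```python
# Sketch: where does the decompressed kernel end, and does that range
# collide with the compressed zImage that U-Boot loaded?

def end_address(start: int, size: int) -> int:
    """Exclusive end address of a region of `size` bytes at `start`."""
    return start + size


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


DECOMP_START = 0x80008000   # decompression start address from the post
DECOMP_SIZE = 18547636      # decompressed size in bytes from the post
ZIMAGE_LOAD = 0x82000000    # U-Boot load address of the compressed zImage
ZIMAGE_SIZE = 8440568       # compressed zImage size in bytes from the post

decomp_end = end_address(DECOMP_START, DECOMP_SIZE)
zimage_end = end_address(ZIMAGE_LOAD, ZIMAGE_SIZE)

print(f"decompressed kernel: 0x{DECOMP_START:08X} - 0x{decomp_end:08X}")
print(f"compressed zImage:   0x{ZIMAGE_LOAD:08X} - 0x{zimage_end:08X}")
print("ranges overlap:", overlaps(DECOMP_START, decomp_end, ZIMAGE_LOAD, zimage_end))
```

The computed end address matches the 0x811B83B4 reported above, and with the zImage at 0x82000000 the two ranges no longer intersect.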

We also noticed that if we remove some components from the initramfs, the zImage size drops below 8 MB (to around 7.4 MB) and this zImage works fine. In this case decompression does not go beyond address 0x81000000. So we came to the conclusion that the kernel does not boot when decompression goes beyond address 0x81000000.

To further confirm our finding, we took the working kernel with initramfs and changed the kernel decompression start address (zreladdr-y) and kernel text offset (textofs-y) so that decompression starts at 0x80078000 and ends at 0x810003B4 (i.e. beyond 0x81000000). In this case, too, the kernel does not boot (see the logs we provided in the 7 July 2015 post).

We have a working kernel without initramfs of around 5 MB (compressed zImage size). This kernel boots successfully with decompression starting at 0x80008000 (decompression ends at 0x80C143B4). The compressed kernel is loaded at address 0x82000000 from U-Boot.

Now just for testing, we changed zreladdr, textofs to 0x803E8000 and 0x03E8000. this kernel boot is successful (in this case decompression ends at 80FF43B4 i.e. below 0x81000000).

then we changed zreladdr, textofs to 0x803F8000 and 0x003F8000. this kernel does not boot (in this case decompression ends at 0x810043B4 i.e. above 0x81000000). kernel hangs during second CPU bootup(same as logs provided on 7 july 2015). in this case if we change bootargs to set maxcpus=1 then kernel boot is successfull.

so even with smaller kernel size, there seems to be some issue if kernel decompression goes beyond address 0x81000000 and maxcpus is greater than 1.
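
The end addresses quoted in this post can be cross-checked with a few lines of arithmetic. This is only a sketch: the helper names are made up for illustration, and the size of the kernel without initramfs is not stated directly, so it is derived here from the reported start and end addresses.

```python
# Sketch: recompute where decompression ends for the cases reported above.

LIMIT = 0x81000000                              # address the failures correlate with

INITRAMFS_IMAGE_SIZE = 18547636                 # decompressed size, as reported
PLAIN_IMAGE_SIZE = 0x80C143B4 - 0x80008000      # derived from the reported end address


def decompress_end(start: int, size: int) -> int:
    """End address of an image of `size` bytes decompressed at `start`."""
    return start + size


def crosses_limit(start: int, size: int) -> bool:
    """True if the decompressed image extends beyond LIMIT."""
    return decompress_end(start, size) > LIMIT


if __name__ == "__main__":
    cases = [
        ("with initramfs, default start", 0x80008000, INITRAMFS_IMAGE_SIZE),
        ("no initramfs, default start", 0x80008000, PLAIN_IMAGE_SIZE),
        ("no initramfs, zreladdr 0x803E8000", 0x803E8000, PLAIN_IMAGE_SIZE),
        ("no initramfs, zreladdr 0x803F8000", 0x803F8000, PLAIN_IMAGE_SIZE),
    ]
    for label, start, size in cases:
        end = decompress_end(start, size)
        print(f"{label}: ends at {end:#010x}, beyond limit: {crosses_limit(start, size)}")
```

The computed end addresses agree with the values given above, and the two cases that cross 0x81000000 are exactly the two that hang with more than one CPU.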

Quite some time back there was a long discussion among kernel developers about ARMv7 module/initrd loading and the 32MB signed branch limitation. The gist of the conversation is that making a larger space was possible, but horribly complicated, ugly, and inefficient. Those developers went to great lengths to find an elegant alternative, but eventually decided that no such “elegant” solution (nor even a “reasonable” solution) existed. So I believe there is enough going on in the kernel that simply moving the kernel start address around will not solve space limitations.

It sounds like you can work around the issue with initrd content adjustments. Even though this is probably not ideal for you, the only alternative I can see is to understand how the “…_LESS_16M/24M” option works and modify it to provide even less memory to modules in order to gain initrd space (i.e., introduce yet another option such as “…LESS_30M” to leave 2MB for modules and 30MB for initrd). Even if you did this, I don’t know whether you’d end up with even more issues because of insufficient module space. When a module straddles that 32MB boundary there isn’t an outright failure; there are just odd kernel OOPS or error messages when something like a spinlock decides it wants to branch between the two blocks of code.

Pratik,

Can you check whether the config below is enabled in your .config? AFAIK, there is no IO mapped at 0x81000000, and it looks like a memory corruption issue.
CONFIG_INITRAMFS_COMPRESSION_GZIP=y

bbasu,

Right now, for testing, we are not using an initramfs. We are booting a normal kernel zImage.

As per my last post, we changed zreladdr and textofs to 0x803F8000 and 0x003F8000. This kernel does not boot (in this case decompression ends at 0x810043B4, i.e. above 0x81000000). The kernel hangs during second CPU bring-up (same as the logs provided on 7 July 2015).

That means even a normal kernel cannot boot if the decompression start address is changed AND multiple CPU support is enabled. The same kernel can boot with maxcpus=1.

It seems the issue occurs only with multiple CPUs.

The following is the kernel configuration we are using …
The initramfs source, CONFIG_INITRAMFS_SOURCE="/home/tegra/initramfs.cpio", is 15MB and CONFIG_INITRAMFS_COMPRESSION_GZIP is enabled in the config. The resulting zImage size is 12MB.
config.log (115 KB)

Hi Pratik,

Please test with this patch, which should fix the issue:

diff --git a/arch/arm/mach-tegra/headsmp.S b/arch/arm/mach-tegra/headsmp.S
index 5558510..2115210 100644
--- a/arch/arm/mach-tegra/headsmp.S
+++ b/arch/arm/mach-tegra/headsmp.S
@@ -50,12 +50,13 @@
  */
 __CPUINIT
 ENTRY(tegra_secondary_startup)

+    ldr r0, =tegra_with_secure_firmware
+    sub r0, #(PAGE_OFFSET - TEGRA_DRAM_BASE)
+    ldr r12, [r0]
     bl __invalidate_cpu_state
-    ldr r0, =tegra_with_secure_firmware
-    sub r0, #PAGE_OFFSET
-    ldr r0, [r0]
-    cmp r0, #1
+    cmp r12, #1 @ secure firmware present?
     beq secondary_startup

     /* enable user space perf counter access */
@@ -85,10 +86,10 @@ ENDPROC(tegra_secondary_startup)
  * re-enabling sdram.
  */
 ENTRY(tegra_resume)
-    ldr r1, =tegra_with_secure_firmware
-    sub r1, #PAGE_OFFSET
-    ldr r1, [r1]
-    cmp r1, #1
+    ldr r0, =tegra_with_secure_firmware
+    sub r0, #(PAGE_OFFSET - TEGRA_DRAM_BASE)
+    ldr r12, [r0]
+    cmp r12, #1
     bne cpu_not_secure

     mov32 r1, TEGRA_TMRUS_BASE
@@ -129,9 +130,7 @@ cpu_not_secure:

 #ifdef CONFIG_CACHE_L2X0
 #if !defined(CONFIG_ARCH_TEGRA_14x_SOC)
-    ldr r1, =tegra_with_secure_firmware
-    ldr r1, [r1]
-    cmp r1, #1
+    cmp r12, #1 @ secure firmware present?
     beq cpu_resume

     adr r0, tegra_resume_l2_init
@@ -240,10 +239,7 @@ __invalidate_cpu_state:
     teq r1, r0
     beq cortex_a9

-    ldr r0, =tegra_with_secure_firmware
-    sub r0, #PAGE_OFFSET
-    ldr r0, [r0]
-    cmp r0, #1
+    cmp r12, #1 @ secure firmware present?
     beq __enable_i_cache_branch_pred

     mrc p15, 0x1, r0, c15, c0, 3 @ L2 prefetch control reg
@@ -391,10 +387,9 @@ ENTRY(__tegra_cpu_reset_handler)
     b .
 #endif

-    ldr r0, =tegra_with_secure_firmware
-    sub r0, #PAGE_OFFSET
-    ldr r0, [r0]
-    cmp r0, #1
+    adr r12, __tegra_cpu_reset_handler_data
+    ldr r7, [r12, #RESET_DATA(SECURE_FW_PRESENT)]
+    cmp r7, #1 @ if !secure
     beq cpu_is_secure

     cpsid aif, 0x13 @ SVC mode, interrupts disabled
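
The difference between the two `sub` forms in the patch can be sketched numerically. This is an illustration under assumptions, not a statement of how the board behaves: PAGE_OFFSET = 0xC0000000 and TEGRA_DRAM_BASE = 0x80000000 are assumed values (the latter is consistent with the load addresses used in this thread), the symbol address is hypothetical, and the function names are made up. The idea is that early CPU startup code needs a physical address, and subtracting only PAGE_OFFSET from a kernel virtual address gives the offset into DRAM rather than the physical address itself.

```python
# Sketch: what each `sub` variant in the patch computes from a virtual address.

PAGE_OFFSET = 0xC0000000       # assumed kernel virtual base (3G/1G split)
TEGRA_DRAM_BASE = 0x80000000   # assumed physical start of DRAM


def sub_page_offset(virt: int) -> int:
    """Old form: sub rX, #PAGE_OFFSET (yields an offset into DRAM)."""
    return virt - PAGE_OFFSET


def sub_page_offset_minus_dram_base(virt: int) -> int:
    """New form: sub rX, #(PAGE_OFFSET - TEGRA_DRAM_BASE) (yields a DRAM address)."""
    return virt - (PAGE_OFFSET - TEGRA_DRAM_BASE)


if __name__ == "__main__":
    symbol_virt = 0xC1004000   # hypothetical virtual address of the flag variable
    print(f"old form: {sub_page_offset(symbol_virt):#010x}")
    print(f"new form: {sub_page_offset_minus_dram_base(symbol_virt):#010x}")
```

With these assumed values only the new form lands inside DRAM. The patch also loads the flag once into r12 and reuses it in later checks, and in `__tegra_cpu_reset_handler` it reads the flag relative to `__tegra_cpu_reset_handler_data` using `adr` instead.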

Hi bbasu,

Your patch is working, thanks a lot. We are now able to boot the board with multiple CPUs as well.

I can see that you changed the secondary CPU startup code, but for my understanding, what exactly does the above patch change?