We are using a Jetson TK1 board for development.
We are using a zImage with the initramfs built into the zImage itself. We hit a problem when the zImage (with initramfs) grows beyond about 7 MB: kernel boot hangs when it tries to power on the other CPUs. If we add maxcpus=1 to bootargs, the kernel boots fine, but we only get one CPU.
If we reduce the zImage size, the kernel also boots fine with multiple CPUs. After debugging we found that the problem occurs when kernel decompression starts at address 0x80008000 and ends beyond address 0x81000000.
So it seems that our bigger zImage overwrites some data at address 0x81000000 during decompression, and this causes boot to hang. We are loading the kernel at 0x82000000 and the dtb at 0x83000000 from U-Boot.
Is there anything hardcoded in the kernel for address 0x81000000?
The physical memory space where the kernel is initially loaded sits just above the space reserved for modules and the initial ramdisk. The total size reserved for both spaces together is 32MB. A typical failure occurs when 16MB is reserved for modules and 16MB for the initrd (32MB total), but the modules need more than 16MB. You won’t usually get an outright failure; instead, some module containing a spinlock or similar piece of code hits a branch-instruction-out-of-range error (the initrd is placed between the kernel’s lower address and the modules, so part of one or more modules can end up out of range because of the initrd size). Unless the right options are enabled, you may not see this logged.
In your case it sounds like a critical block of code from a module is unreachable due to the 32MB branch instruction limitation.
If you need more module space (taking from initrd space) you’ll find the only option is to allocate a slightly different amount of space between modules and initrd, where the total space is always 32MB. The default kernel setting is via “CONFIG_TASK_SIZE_3G_LESS_24M” (versus the mutually exclusive “CONFIG_TASK_SIZE_3G_LESS_16M”). The default takes 32MB and reserves 24M for initrd, leaving modules with 8MB (32MB - 24MB); the mutually exclusive inverse option reserves 16M for initrd, thus allowing up to 16MB (32MB - 16MB) for modules.
Sorry, I don’t remember which menuconfig path gets to that option. Unless your initrd actually requires more than 16MB this is a fairly easy fix. If you don’t see the config menu option which deals with this I can probably find it…hand editing of the .config file isn’t advised, there may be other changes to configuration when switching between those two options. If you need more than 32MB total, you’re out of luck…the limitation is on the assembler branch instruction and would require massive re-writes to get larger spaces to work. A good work-around is to compile previously module-format features to become non-module/integrated (thus reducing module space total size required).
Note that the kernel image itself loads its beginning physical address at 0x81000000. Below 0x81000000 physical address is the initrd, and below this is module space. The total space below the kernel which comprises the initrd plus module space must not exceed 32MB. Although I had assumed modules were being pushed beyond 32MB via the initrd size, it looks like the initrd is trying to decompress and overwriting the lower kernel physical address.
I think my first suggestion actually went the wrong direction, as I was looking for module boundaries, but the issue is the initrd being too large. Modules too far out can’t branch to the kernel, initrd too large overwrites the kernel in the lower address space. Changing allowed initrd from 24MB to 16MB would actually give it less room when it was already overwriting with 24MB.
I do not know of any kernel configuration to give more than 24MB to initrd. Something may exist, but it seems you have no choice but to cut down the initrd size, and probably go back to the …LESS_24M configuration. I’m not sure but it may be possible to squeeze a bit more space out for initrd if you completely remove kernel module support (that’s a pretty big loss).
I have not tried to load a kernel at an alternate address, so I do not know what issues might arise from that. However, does your kernel use modules? The 32MB branch limit still applies; it’s a limit of the architecture’s branch instruction.
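To make the 32MB figure concrete: in ARM (A32) state, the B and BL instructions encode a signed 24-bit word offset, which is where the roughly ±32MB reach comes from. The sketch below checks whether a direct branch from one address can reach another. The function name and the example addresses are illustrative assumptions, not anything taken from the kernel source.

```python
# Rough check of the ARM (A32) B/BL branch range.
# The instruction holds a signed 24-bit immediate that is shifted left by 2,
# and the offset is relative to the branch address + 8.

PC_OFFSET = 8                      # A32 reads PC as current instruction + 8
MIN_OFFSET = -(1 << 23) * 4        # -32 MiB
MAX_OFFSET = ((1 << 23) - 1) * 4   # +32 MiB minus 4 bytes


def branch_in_range(src: int, dst: int) -> bool:
    """Return True if an A32 B/BL located at `src` can reach `dst` directly."""
    offset = dst - (src + PC_OFFSET)
    return offset % 4 == 0 and MIN_OFFSET <= offset <= MAX_OFFSET


if __name__ == "__main__":
    kernel_text = 0x81000000                # hypothetical branch target
    near = kernel_text - 16 * 1024 * 1024   # caller 16 MiB below the target
    far = kernel_text - 40 * 1024 * 1024    # caller 40 MiB below the target
    print("16 MiB away reachable:", branch_in_range(near, kernel_text))
    print("40 MiB away reachable:", branch_in_range(far, kernel_text))
```

This only models a single direct branch; it says nothing about where the kernel actually places modules or the initrd on this board.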
If you moved the kernel to a 16MB higher physical address as mentioned (0x82000000 - 0x81000000 = 0x01000000), I’m not sure how the module and initrd spaces would change, if at all. One possibility is that this “access window” would remain 32MB in size and move up with the kernel regardless of what the boot loader does (I do not know if the kernel is looking at a relative address or an absolute address where the issue occurs, nor do I know if the kernel cares what U-Boot does), in which case nothing has changed.
Even if the initrd gains that extra space it may still be insufficient, depending on the decompressed image size. If the issue is related to modules exceeding the 32MB branch limit at some point, and if they have moved another 16MB away from the kernel base, then the modules will hit their branch limits earlier. This is hard to predict without a debugger, because code more than 32MB away works fine if it was entered from closer than the 32MB limit, and if no code further out requires a branch, nothing will happen…I’ve seen the most issues when a module branches because of a spinlock.
How big is the decompressed image? Have you tried reducing the initrd size? Have you tried converting modules to integrated (non-module) kernel features where possible? There simply isn’t a way to say why going to a single CPU changes the issue, other than perhaps that more code loads and runs with multiple CPUs.
Seeing how things change with non-module integrated code (instead of modules) and with reduced initrd size would help. What is in the initrd? Was this particular issue why the kernel base address was moved higher, or was there something else prompting the address move?
In our case the zImage size is 8440568 bytes (including the initramfs inside).
The reason for changing the compressed kernel image load address to 0x82000000 is as follows.
After debugging we found that the decompressed size is 18547636 bytes, so decompression starting at address 0x80008000 ends at 0x811B83B4. Since this would overwrite the compressed zImage, we changed the kernel load address in U-Boot to 0x82000000.
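For anyone following along, the arithmetic in this post can be reproduced with a few lines of Python. This is only a sketch: the helper names are made up, and the original load address of 0x81000000 is an assumption (the post gives only the new address, 0x82000000).

```python
# Reproduce the decompression range reported in the post.
DECOMP_START = 0x80008000      # decompression start address (from the post)
DECOMP_SIZE = 18547636         # decompressed size in bytes (from the post)
ZIMAGE_SIZE = 8440568          # compressed zImage size in bytes (from the post)


def end_address(start: int, size: int) -> int:
    """First address past a region of `size` bytes beginning at `start`."""
    return start + size


def overlaps(a_start: int, a_size: int, b_start: int, b_size: int) -> bool:
    """True if the two half-open ranges share at least one byte."""
    return a_start < b_start + b_size and b_start < a_start + a_size


if __name__ == "__main__":
    print("decompression ends at", hex(end_address(DECOMP_START, DECOMP_SIZE)))
    # 0x81000000 is an ASSUMED original load address;
    # 0x82000000 is the load address given in the post.
    for load in (0x81000000, 0x82000000):
        hit = overlaps(DECOMP_START, DECOMP_SIZE, load, ZIMAGE_SIZE)
        print(hex(load), "overlaps decompressed image:", hit)
```

Under that assumption, a zImage loaded at 0x81000000 would sit inside the decompression target range, while one loaded at 0x82000000 is clear of it.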
We also noticed that if we remove some components from the initramfs, the zImage size drops below 8 MB (to around 7.4MB) and this zImage works fine. In this case decompression does not go beyond address 0x81000000. So we came to the conclusion that the kernel does not boot when decompression goes beyond address 0x81000000.
To confirm this finding, we took the working kernel with initramfs and changed the kernel decompression start address (zreladdr-y) and kernel start text address (textofs-y) to 0x80078000, so that decompression ends at 0x810003B4 (i.e. beyond 0x81000000). In this case the kernel also does not boot (see the logs in our post of 7 July 2015).
We have a working kernel without initramfs, with a compressed zImage size of around 5 MB. This kernel boots successfully with decompression starting at 0x80008000 (decompression ends at 0x80C143B4). The compressed kernel is loaded at address 0x82000000 from U-Boot.
Now, just for testing, we changed zreladdr and textofs to 0x803E8000 and 0x003E8000. This kernel boots successfully (in this case decompression ends at 0x80FF43B4, i.e. below 0x81000000).
Then we changed zreladdr and textofs to 0x803F8000 and 0x003F8000. This kernel does not boot (in this case decompression ends at 0x810043B4, i.e. above 0x81000000). The kernel hangs during second CPU bootup (same as the logs provided on 7 July 2015). In this case, if we change bootargs to set maxcpus=1, the kernel boots successfully.
So even with a smaller kernel, there seems to be a problem if kernel decompression goes beyond address 0x81000000 and maxcpus is greater than 1.
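The three no-initramfs runs described above line up neatly once the end address is computed for each zreladdr. In this sketch the decompressed size is inferred from the reported start and end addresses (0x80008000 and 0x80C143B4) rather than measured separately, and the function name is hypothetical.

```python
# Tabulate where decompression ends for each tested zreladdr.
BOUNDARY = 0x81000000
# Size inferred from the reported range 0x80008000..0x80C143B4; this ASSUMES
# the decompressed size is identical in all three runs.
DECOMP_SIZE = 0x80C143B4 - 0x80008000

# zreladdr -> boot result with multiple CPUs, as reported in the posts
RUNS = {
    0x80008000: "boots",
    0x803E8000: "boots",
    0x803F8000: "hangs at second CPU bootup",
}


def crosses_boundary(zreladdr: int, size: int = DECOMP_SIZE) -> bool:
    """True if the decompressed image extends past BOUNDARY."""
    return zreladdr + size > BOUNDARY


if __name__ == "__main__":
    for zreladdr, result in RUNS.items():
        end = zreladdr + DECOMP_SIZE
        print(f"{zreladdr:#010x} -> {end:#010x} "
              f"crosses={crosses_boundary(zreladdr)} ({result})")
```

Only the run that crosses 0x81000000 is the one reported as hanging, which matches the conclusion in the post; the script does not explain why that address matters.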
Quite some time back there was a long discussion by kernel developers about ARMv7 module/initrd loading and the 32MB signed branch limitation. The gist of the conversation is that making a larger space was possible, but that it was horribly complicated, ugly, and inefficient. Those developers went to great lengths to find an elegant alternative, but eventually decided that no such “elegant” solution (nor even a “reasonable” solution) existed. So I believe there is a lot going on in the kernel, such that simply moving the kernel start address around will not solve the space limitation.
It sounds like you can work around the issue with initrd content adjustments, and even though this is probably not ideal for you, the only alternative which I can see is to understand how the “…_LESS_16M/24M” option works and modify to provide even less memory to modules in order to gain initrd space (i.e., introduce yet another option such as “…LESS_30M” to leave 2MB for modules and 30MB for initrd). Even if you did this I don’t know if you’d end up with even more issues because of insufficient module space. When a module straddles that 32MB boundary there isn’t an outright failure, there is just odd kernel OOPS or error messages when something like a spinlock decides it wants to do a branch between the two blocks of code.
Right now, for testing, we are not using the initramfs. We are booting with a normal kernel zImage.
As per my last post, we changed zreladdr and textofs to 0x803F8000 and 0x003F8000. This kernel does not boot (in this case decompression ends at 0x810043B4, i.e. above 0x81000000). The kernel hangs during second CPU bootup (same as the logs provided on 7 July 2015).
That means even a normal kernel cannot boot if the decompression start address is changed AND multiple CPU support is enabled. The same kernel can boot if maxcpus=1.
It seems like the issue is with multiple CPUs only.
The following is the kernel configuration we are using …
The size of the initramfs source, CONFIG_INITRAMFS_SOURCE="/home/tegra/initramfs.cpio", is 15MB, and CONFIG_INITRAMFS_COMPRESSION_GZIP is enabled in the config. The resulting zImage size is 12MB. config.log (115 KB)