Illegal instruction with I-cache disabled

We are porting the seL4 microkernel to Xavier NX. To support loading ELF binaries, the program is copied to the correct place, then we clean the data cache and invalidate the instruction cache with a sequence like this:

dc cvau, x0
dsb ish
ic ivau, x0
dsb ish
isb

Here x0 iterates over the virtual addresses of the loaded image. Question #1: When we jump to the loaded code with the I-cache disabled (SCTLR_EL1.I == 0), we get an illegal instruction exception. With the I-cache enabled (SCTLR_EL1.I == 1), the code runs OK. Why is this?
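For reference, here is a minimal C sketch of how such a range loop is commonly written. It is not taken from our port: the helper names (`dcache_line_bytes`, `icache_line_bytes`, `lines_in_range`, `sync_icache_range`) are hypothetical, and it assumes the usual pattern of cleaning the whole range, issuing one barrier, then invalidating the whole range, rather than a barrier per line. The line sizes are read from CTR_EL0 instead of being hard-coded.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest D-cache line in bytes, from CTR_EL0.DminLine (bits [19:16],
 * encoded as log2 of the line length in 4-byte words). */
static inline uint64_t dcache_line_bytes(uint64_t ctr_el0)
{
    return 4ull << ((ctr_el0 >> 16) & 0xf);
}

/* Smallest I-cache line in bytes, from CTR_EL0.IminLine (bits [3:0]). */
static inline uint64_t icache_line_bytes(uint64_t ctr_el0)
{
    return 4ull << (ctr_el0 & 0xf);
}

/* Number of cache lines touched by the half-open range [start, end). */
static inline uint64_t lines_in_range(uint64_t start, uint64_t end,
                                      uint64_t line)
{
    assert(start <= end && line != 0 && (line & (line - 1)) == 0);
    if (start == end)
        return 0;
    uint64_t first = start & ~(line - 1);
    uint64_t last = (end - 1) & ~(line - 1);
    return (last - first) / line + 1;
}

#if defined(__aarch64__)
/* Hypothetical helper: make [start, end) coherent for instruction fetch. */
static inline void sync_icache_range(uint64_t start, uint64_t end)
{
    uint64_t ctr;
    __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
    uint64_t dline = dcache_line_bytes(ctr);
    uint64_t iline = icache_line_bytes(ctr);

    /* Clean every D-cache line in the range to the point of unification. */
    for (uint64_t a = start & ~(dline - 1); a < end; a += dline)
        __asm__ volatile("dc cvau, %0" : : "r"(a) : "memory");
    __asm__ volatile("dsb ish" : : : "memory");

    /* Invalidate every I-cache line in the range. */
    for (uint64_t a = start & ~(iline - 1); a < end; a += iline)
        __asm__ volatile("ic ivau, %0" : : "r"(a) : "memory");
    __asm__ volatile("dsb ish" : : : "memory");
    __asm__ volatile("isb" : : : "memory");
}
#endif
```

The pure helpers can be checked on any host; only `sync_icache_range` needs an AArch64 target. Whether this standard sequence is sufficient on Carmel is exactly what we are asking, so please read it as an illustration of the intent, not as a verified fix.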

Question #2: With the I-cache enabled, our project runs to some extent, but halts completely at random places. Even the Lauterbach debugger cannot connect to the Carmel CPUs anymore. We realize the Carmel cores are an NVIDIA proprietary design with an ARMv8.2 front end. The documentation says that the internal translator stores the micro-ops in system RAM and that proper carveouts are needed. We found out how CBOOT passes available memory to Linux via the DTB, punching the carveouts out of the DRAM memory map. We suppose it is enough to use only the free memory areas listed in the DTB in order to avoid clashes with the aforementioned translation mechanism. Is that true?
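To make the assumption behind Question #2 concrete, this is roughly the check we have in mind, as a small C sketch. The `mem_region` struct and `range_is_free` function are hypothetical names, and the region list is assumed to have been extracted from the DTB memory node beforehand (parsing is not shown). A range is accepted only if it fits entirely inside a single free region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One free DRAM region as listed in the DTB (base address and size). */
struct mem_region {
    uint64_t base;
    uint64_t size;
};

/* Return true if [base, base + size) lies entirely inside one of the
 * free regions, i.e. it cannot overlap a carveout between regions. */
static inline bool range_is_free(const struct mem_region *free_regions,
                                 size_t n, uint64_t base, uint64_t size)
{
    assert(free_regions != NULL || n == 0);
    if (size == 0 || base + size < base)    /* empty or wrapping range */
        return false;
    for (size_t i = 0; i < n; i++) {
        uint64_t rb = free_regions[i].base;
        uint64_t rs = free_regions[i].size;
        /* Overflow-safe form of: base >= rb && base + size <= rb + rs */
        if (base >= rb && size <= rs && base - rb <= rs - size)
            return true;
    }
    return false;
}
```

Note that this sketch does not merge adjacent regions, so a range spanning two back-to-back free regions is rejected. That is deliberately conservative.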

Last but not least, Question #3: when we repeatedly try to boot our seL4 project, the results get progressively better (or worse, depending on the test) on subsequent retries, suggesting that the behavior has something to do with the die temperature, voltages or frequencies. As we do not run Linux, we do not have any code doing DVFS or thermal throttling. However, we understood from the L4T BSP that BPMP does that, and that the stability of the CCPLEX does not depend on whether or not you run Linux with all its DVFS drivers and the like. Is this true? How can our perceived temperature dependency be explained?

I have forwarded your question to the internal BSP team to see if they can share their experiences.

Thank you. Could you please also tell them that we solved question #1: we were experimenting with everything we found in the Xavier TRM, including the NV_Cache_CLEAN_EL1 / NV_Cache_INVAL_DATA_EL1 / NV_Cache_INVAL_ALL_EL1 system registers, which operate at the CCPLEX level. It appears that a) they are not needed in our use case, and b) we had used them in the wrong place and they caused the trouble. Our code is now working fine with the I-cache disabled. So only questions #2 and #3 remain.

Hi,

Regarding temperature,
you can see the temperature info from the BPMP console:
cat soctherm/group_CPU/temp

DRAM frequency tuning is done by BPMP, and you can see the stats from the node below:
cat emc/stats

But the CPU frequency will be mostly fixed at 1.1 GHz since the kernel cpufreq driver is not active.

regards
Bibek

Also, please note that CBoot also runs on the CCPLEX and boots on a single core.
You can refer to the CBoot code, which is available with JetPack.
Are you booting on a single core?

Could you share why you want to boot with the I-cache disabled, and what benefit you are expecting?

We just noted that our seL4 test suite proceeds farther with the I-cache disabled, and the kernel running in EL1 does not hang. With the I-cache enabled the lockup is complete and the Lauterbach cannot connect to the CCPLEX cores anymore. Naturally it is our desire to have the I-cache enabled.

The seL4 kernel can be run in EL1 with user apps at EL0, or it can be run as a hypervisor in EL2, with user applications at EL1 and EL0. We also ran into a very peculiar issue where a RAM page was mapped (meant to be accessible from EL1), the data caches were cleaned and the TLBs were invalidated. Yet an access to that page resulted in a Data Abort, translation fault at level 3. We dumped the page tables for that TTBRx_ELx setting, and according to them the translation should have succeeded. At this point we switched from using seL4 at EL1/EL0 to using it at EL2/EL1 and this problem went away, suggesting there could be some stage-2 related MMU settings to be verified (CBOOT runs at EL2 and is supposed to turn the MMU off, but we will need to investigate).
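For anyone who wants to repeat that page table check, below is a small C sketch of the two decoding steps involved. The function names are hypothetical, and it assumes a 4 KB translation granule with 48-bit virtual addresses (9 index bits per level, levels 0 to 3). It decodes the exception class and fault status from an ESR_ELx value, and computes the table index used at each lookup level for a faulting address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx.EC, bits [31:26]: exception class. */
static inline unsigned esr_ec(uint64_t esr) { return (esr >> 26) & 0x3f; }

/* ESR_ELx.ISS[5:0]: data fault status code (DFSC) for Data Aborts. */
static inline unsigned esr_dfsc(uint64_t esr) { return esr & 0x3f; }

/* EC 0x24: Data Abort from a lower EL; EC 0x25: Data Abort, same EL. */
static inline bool is_data_abort(uint64_t esr)
{
    unsigned ec = esr_ec(esr);
    return ec == 0x24 || ec == 0x25;
}

/* DFSC 0b0001LL encodes a translation fault at lookup level LL. */
static inline bool is_translation_fault(uint64_t esr, unsigned *level)
{
    unsigned dfsc = esr_dfsc(esr);
    if (!is_data_abort(esr) || (dfsc & 0x3c) != 0x04)
        return false;
    if (level)
        *level = dfsc & 0x3;
    return true;
}

/* Table index used at lookup level 0..3 for a virtual address.
 * Assumption: 4 KB granule, 48-bit VA, so level 0 uses bits [47:39],
 * level 1 [38:30], level 2 [29:21] and level 3 [20:12]. */
static inline unsigned va_index(uint64_t va, unsigned level)
{
    assert(level <= 3);
    return (unsigned)((va >> (39 - 9 * level)) & 0x1ff);
}
```

With the fault address from FAR_ELx, `va_index` gives the entry to inspect at each level of the dumped tables. A level 3 translation fault means the walk reached the last-level table and found an invalid entry there, so that is the entry whose in-memory contents we compared against what we expected.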

Thanks for your feedback. So I take it you are basically saying we should be fine without any voltage/frequency support code in our software running on CCPLEX cores?

And yes, we are running on a single core only and have studied the CBOOT sources quite a bit. We are still pondering what the problem could be; the most important thing for us is to rule out a need for any non-standard AArch64 programming practices. If you could shed light on that question, we would be grateful and happy to try to find the problem in our seL4 port.

CBoot turns the MMU off before jumping to the kernel. The kernel boots at EL2, then switches to EL1 and performs an isb. Is it possible to share your boot code, or part of it?

Yes, we will share our code soon as open source. We have a Docker-based build environment ready (internally Ubuntu 20.10); we just need to make sure it is smooth enough for others to use. I will ping you as soon as we can release it.

Any update here? Did you get it to work, and can you share the porting work? I'd be interested in running seL4 on this platform as well.

You can find our port here: https://github.com/tiiuae/tii_sel4_build/tree/xaviernx/hardware/xaviernx

Unfortunately we have not made any progress on this. Note that there are several build configurations: one with all caches enabled, one with the instruction cache disabled, one with the L2 data cache disabled, and one with both the instruction cache and the L2 data cache disabled. The last one seems to get farthest, while the others just hang.