IC IALLUIS inordinately expensive on Xavier (virtual machine scenario)

On Xavier, execution of the IC IALLUIS instruction is inordinately expensive. This poses significant constraints when running with virtual machines, as they can drive host performance down massively.

As such, we can’t currently guarantee remotely consistent host performance due to this issue. Virtual machines aren’t supposed to be able to massively slow down the host or to be able to crash it…

Test scenario (as an example, FreeBSD is a guest operating system that does not do selective instruction cache invalidates, relying on IC IALLU[IS] instead):
wget https://download.freebsd.org/releases/VM-IMAGES/13.0-RELEASE/aarch64/Latest/FreeBSD-13.0-RELEASE-arm64-aarch64.qcow2.xz
xz -d FreeBSD-13.0-RELEASE-arm64-aarch64.qcow2.xz
qemu-system-aarch64 -M virt -cpu host -accel kvm -m 2048 -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd -hda FreeBSD-13.0-RELEASE-arm64-aarch64.qcow2 -device ramfb -device nec-usb-xhci -device usb-kbd

Then try to do anything on the host, such as dragging a window, while the virtual machine is booting.
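A rough way to put a number on the host slowdown, instead of judging it by window dragging, is a small probe that spins on the host and records the longest gap between two consecutive clock reads. This is only a sketch: `now_ns` and `max_gap_ns` are hypothetical helper names, the clock is the C11 wall clock (so time adjustments can distort a reading), and ordinary scheduler preemption also shows up as a gap, so the value is only meaningful when compared with and without the guest booting.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current wall-clock time in nanoseconds (C11 timespec_get). */
static int64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/*
 * Spin for `iterations` clock reads and return the largest gap, in
 * nanoseconds, observed between two consecutive reads. A stalled host
 * core shows up as an unusually large gap.
 */
static int64_t max_gap_ns(long iterations)
{
    assert(iterations >= 0);

    int64_t prev = now_ns();
    int64_t worst = 0;

    for (long i = 0; i < iterations; i++) {
        int64_t t = now_ns();
        if (t - prev > worst)
            worst = t - prev;
        prev = t;
    }
    return worst;
}
```

Running `max_gap_ns` with a large iteration count once on an idle host and once while the guest boots gives two numbers to compare.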

I don’t see any way for us to work around it on the software side…

Test environment: JetPack 5.0 DP with Ubuntu 20.04, also applicable to earlier releases.

Our team will do the investigation and provide suggestions soon. Thanks

The problem seems specific to the ARM port of FreeBSD.
If you are talking about the code below, please try commenting out and uncommenting the lines marked in it, so that the range-based invalidation for a PIPT instruction cache is used.

https://cgit.freebsd.org/src/tree/sys/arm64/arm64/cpufunc_asm.S#n146
ENTRY(arm64_aliasing_icache_sync_range)
/*
 * XXX Temporary solution - I-cache flush should be range based for
 * PIPT cache or IALLUIS for VIVT or VIPT caches
 */

/* cache_handle_range dcop = cvau, ic = 1, icop = ivau */ // Uncomment this Line
cache_handle_range dcop = cvau // Comment this Line
ic ialluis // Comment this Line
dsb ish
isb
ret
END(arm64_aliasing_icache_sync_range)

If the above doesn’t help, it is better to check with the FreeBSD support forums.
Also, try another guest OS on KVM, such as Linux. That shouldn’t have this issue.
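For reference, the range-based approach that the commented-out line in the reply above points at can be sketched in C roughly as follows. This is not the FreeBSD implementation: `icache_sync_range` and the small wrappers are hypothetical names, the line sizes are passed in by the caller (they are assumed to come from CTR_EL0), and the cache instructions compile to no-ops on anything other than AArch64 so that the loop logic can be checked on any build host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clean one D-cache line to the point of unification (no-op off AArch64). */
static inline void dc_cvau(uintptr_t va)
{
#if defined(__aarch64__)
    __asm__ volatile("dc cvau, %0" : : "r"(va) : "memory");
#else
    (void)va;
#endif
}

/* Invalidate one I-cache line by virtual address (no-op off AArch64). */
static inline void ic_ivau(uintptr_t va)
{
#if defined(__aarch64__)
    __asm__ volatile("ic ivau, %0" : : "r"(va) : "memory");
#else
    (void)va;
#endif
}

/* Barriers matching the dsb ish / isb pair in the assembly above. */
static inline void dsb_ish(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb ish" : : : "memory");
#endif
}

static inline void isb(void)
{
#if defined(__aarch64__)
    __asm__ volatile("isb" : : : "memory");
#endif
}

/*
 * Range-based I-cache sync: clean the D-cache lines covering the range,
 * then invalidate only the I-cache lines covering the same range,
 * instead of invalidating every I-cache line with IC IALLUIS.
 * dline and iline are line sizes in bytes and must be powers of two.
 * Returns the number of IC IVAU operations issued.
 */
static size_t icache_sync_range(const void *start, size_t len,
                                size_t dline, size_t iline)
{
    assert(dline != 0 && (dline & (dline - 1)) == 0);
    assert(iline != 0 && (iline & (iline - 1)) == 0);
    if (len == 0)
        return 0;

    uintptr_t begin = (uintptr_t)start;
    uintptr_t end = begin + len;
    size_t ops = 0;

    for (uintptr_t va = begin & ~(uintptr_t)(dline - 1); va < end; va += dline)
        dc_cvau(va);
    dsb_ish();

    for (uintptr_t va = begin & ~(uintptr_t)(iline - 1); va < end; va += iline) {
        ic_ivau(va);
        ops++;
    }
    dsb_ish();
    isb();
    return ops;
}
```

The cost of this variant scales with the size of the range being synced, whereas IC IALLUIS is broadcast to the whole Inner Shareable domain regardless of how little code changed.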

Hello,

We do indeed have fixes developed on the FreeBSD side; that isn’t the worrying part. The workaround on the FreeBSD side is to issue selective instruction cache invalidates as required; the Xavier TRM does not document any guarantee of icache coherency on Carmel.

The issue is that in any scenario whatsoever, a virtual machine can massively slow down the host.

In this scenario, the virtual machine is considered hostile. As such, the mere fact that a guest OS can do this, no matter which code it runs, isn’t ok…

Or should we not expect these isolation/performance consistency guarantees for virtual machines on Xavier?

Thank you,

edit: handling different bugs separately…

“IC IALLUIS” is very expensive and should not be required on Xavier.

Hello,

We understand that. Let’s consider the scenario where you aren’t trusting what is running in the virtual machine (because it’s supposed to be a security boundary).

In that scenario, what should be done to have the slowdown itself confined to the virtual machine if IC IALLUIS cannot be made cheaper? (confining the slowdown to the contexts using a virtual machine’s VMID would be fine for us, as would be actually making IC IALLUIS faster)

(As a side note, AArch32 Linux VMs are also affected by this.)

Thank you,

You can set ‘ACTLR_EL2.TRAP_IC_IALLU’ and then walk all the pages in the guest with ‘IC IVAU’ from the HV trap handler.

ACTLR_EL2 - Bit [4] : TRAP_IC_IALLU:
When set, any attempt to execute IC IALLU, IC IALLUIS, ICIALLU, or ICIALLUIS in non-secure EL1 traps to EL2.
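To get a feel for what the suggested trap handler would have to do, here is a minimal sketch in C. The bit position comes from the reply above; the helper names are hypothetical, and the 64-byte line size used in the example is an assumption rather than a documented Carmel value. The line size is decoded from CTR_EL0.IminLine (bits [3:0], log2 of the smallest I-cache line size in 4-byte words). Actually reading and writing ACTLR_EL2 and CTR_EL0, and issuing the `IC IVAU` instructions, is left out.

```c
#include <assert.h>
#include <stdint.h>

/* ACTLR_EL2 bit [4], TRAP_IC_IALLU, as described in the reply above. */
#define ACTLR_EL2_TRAP_IC_IALLU (UINT64_C(1) << 4)

/* Return the ACTLR_EL2 value with the IC IALLU/IALLUIS trap enabled. */
static uint64_t actlr_el2_with_ic_trap(uint64_t actlr_el2)
{
    return actlr_el2 | ACTLR_EL2_TRAP_IC_IALLU;
}

/* Smallest I-cache line size in bytes, decoded from a CTR_EL0 value. */
static uint64_t icache_line_bytes(uint64_t ctr_el0)
{
    return UINT64_C(4) << (ctr_el0 & 0xf);
}

/*
 * Number of IC IVAU operations the EL2 handler would issue to cover
 * guest_bytes of guest memory, one per I-cache line.
 */
static uint64_t ic_ivau_ops_for_guest(uint64_t guest_bytes, uint64_t ctr_el0)
{
    uint64_t line = icache_line_bytes(ctr_el0);

    assert(guest_bytes <= UINT64_MAX - line); /* guard the round-up */
    return (guest_bytes + line - 1) / line;
}
```

For the 2048 MiB guest from the test scenario and an assumed 64-byte line (IminLine = 4), `ic_ivau_ops_for_guest(UINT64_C(2048) << 20, 4)` gives 33,554,432 operations per trapped instruction if every guest page is walked, so the emulation is itself expensive; the difference is that the cost is paid in the trap handler on behalf of the guest that executed the instruction.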
