Processor bugs with KVM on Nvidia Carmel

Hello,

The qemu-system-arm and qemu-efi-aarch64 packages from the Ubuntu repositories were used here.

When enabling KVM (with VHE on) and trying to run guests, I can point out a couple of notable issues off the top of my head:

qemu-system-aarch64 -enable-kvm -M virt -cpu host -m 4096 -hda miniroot67.fs -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd -device ramfb

The first issue is unexpected behaviour in the form of an SError ( Ubuntu Pastebin ).

The second issue is that everything is abnormally slow while the instance boots: the boot itself is slow, and even dragging windows on the host becomes very sluggish. The whole host slows down during that time.

Hopefully this will help to reproduce the issues on NVIDIA’s side.

I use the latest linux-4.9 tree from nv-tegra.nvidia.com with these patches on top: Xavier-KVM/patches/kernel at master · b-man/Xavier-KVM · GitHub.

Thank you,

Please refer to the topic below to see if it helps:

Hello,

Linux AArch64 guests work relatively fine… it’s when you go off that happy path that things become unexpected. Even Linux AArch32 triggers the second issue that I described, which also produces messages like:

hrtimer: interrupt took 6042437 ns

(or even more) in the kernel logs.
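
To check how often and how badly this happens, the durations can be pulled out of the kernel log. This is only a minimal sketch: the helper name max_hrtimer_ns is made up for illustration, and it assumes the message format is exactly the one shown above.

```shell
# Print the longest "hrtimer: interrupt took N ns" duration found on stdin.
# Prints 0 if no such message is present.
max_hrtimer_ns() {
    awk '/hrtimer: interrupt took [0-9]+ ns/ {
             # The number is the field right after the word "took".
             for (i = 1; i < NF; i++)
                 if ($i == "took" && $(i + 1) + 0 > max)
                     max = $(i + 1) + 0
         }
         END { print max + 0 }'
}

# Usage on the host: dmesg | max_hrtimer_ns
```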

Thank you,

An update:

  • Backporting the user-mode irqchip patches and then using kernel-irqchip=off makes the SError go away for OpenBSD. I suspect an MSR handling issue for the timers (or a GIC handling bug?).

  • However, using a user-mode irqchip has no effect on the whole desktop lagging while the VM is booting or has I/O activity.
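
For reference, the first bullet corresponds to something like the following. This is a sketch, not a verified recipe: it reuses the command from the first post, the helper name qemu_cmd is made up, and kernel-irqchip=off only works here on a kernel that has the user-mode irqchip patches backported.

```shell
# Build the QEMU command line from the first post, with the irqchip mode
# as a parameter: "on" = in-kernel irqchip, "off" = user-mode irqchip.
qemu_cmd() {
    echo "qemu-system-aarch64 -enable-kvm -M virt,kernel-irqchip=$1 -cpu host -m 4096" \
         "-hda miniroot67.fs -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd -device ramfb"
}

# Print the command for the user-mode irqchip case.
# To actually launch the guest: eval "$(qemu_cmd off)"
qemu_cmd off
```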

Hopefully that helps,

About this:

Are timers switching between 32-bit and 64-bit such that there is an overflow or number format issue? Imagine the timer is still working correctly, but the data does not fit in the register the timer uses. This would mean the timer does not really know the correct time due to the register, and not due to an actual long IRQ time.
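
To make the question concrete, here is a small sketch of what such a truncation would look like. It is purely illustrative and says nothing about what Carmel or KVM actually do: the function names and counter values are made up, and it assumes a shell with 64-bit arithmetic.

```shell
# Elapsed ticks between two reads of a 64-bit counter when only the low
# 32 bits of each read are kept.

# Naive difference of the truncated values: wrong if the low 32 bits
# wrapped around between the two reads.
naive32() {
    echo $(( ($2 & 0xFFFFFFFF) - ($1 & 0xFFFFFFFF) ))
}

# Same difference, masked back to 32 bits: correct as long as fewer than
# 2^32 ticks really elapsed.
delta32() {
    echo $(( ( ($2 & 0xFFFFFFFF) - ($1 & 0xFFFFFFFF) ) & 0xFFFFFFFF ))
}

before=4294967040   # 0xFFFFFF00, just below the 32-bit limit
after=4294967552    # 0x100000100, 512 ticks later

naive32 "$before" "$after"   # prints -4294966784
delta32 "$before" "$after"   # prints 512
```

In the naive case the timer hardware did nothing wrong, but the computed interval is nonsense, which is the kind of misreport being asked about.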

Just for fun:

…I tell the doctor it hurts when I go off the happy path…so the doctor says “then don’t do that!”. I hate it when I go off the happy path… :P

Hello,

The whole desktop on the host becomes painfully slow in that case, at least when there’s I/O activity in the guest… it becomes unusable. And sadly, that IRQ time is real…

I didn’t explicitly mention it, but that bug also affects arm64 Windows (which I’d like to run without making the whole system slow as hell), in addition to arm64 FreeBSD and AArch32 Linux guests…

I am not familiar with the bug, but if ARM has an erratum on it, then you might post a URL to it here. However, no matter how well it works, any 32-bit mode on a 64-bit ARM CPU will be terribly slow. The question is whether it slows down in ways not expected when in 64-bit mode.