Upgraded my Dell XPS15 9560 to Ubuntu 19.10 and enabled driver 435. Now, whenever I select Nvidia “performance mode” or “on-demand”, syslog gets flooded with
PCIE Bus Error: Severity=Corrected after booting into Ubuntu
I tried to get rid of it by adding the boot parameter ‘pci=nomsi’ to grub:
Oops… unless I am seriously mistaken, I consider this a bad suggestion…
Switch off serious error reporting because we don’t want to see the errors, instead of solving the hardware-related IRQ issues that cause the errors in the first place?
Those are low-level bus errors, you can’t really fix them besides maybe upgrading the bios. I also wouldn’t know what those have to do with irqs. So turning off aer at least prevents log spam.
To make things clearer: nomsi turns off msi and aer, so it’s kind of a placebo. It doesn’t fix those errors, it just turns off the reporting as well.
Are you sure about that? As far as I can tell, nomsi turns off message signalled interrupts, hence the power peaking, but still allows error reporting for other PCI-related issues.
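For what it’s worth, whether a pci= option such as nomsi actually reached the running kernel can be checked from /proc/cmdline. A minimal sketch; pci_boot_options is just a helper name made up for this post, not an existing tool:

```shell
# Print every pci=... option found on a kernel command line read from stdin.
# pci_boot_options is a hypothetical helper name, not a standard command.
pci_boot_options() {
    # Split the command line into one option per line, keep the pci= ones.
    tr ' ' '\n' | grep -E '^pci='
}

# Usage on a running system:
#   pci_boot_options < /proc/cmdline
```

If this prints nothing after a reboot, the parameter never made it into the boot entry, so whatever changed in syslog had another cause.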
Anyway, back to the issue itself: the good news is that it seems to be an Nvidia driver issue and nothing “deeper”… The bad news is that we can’t use Nvidia anymore in Optimus laptops; will try to downgrade the driver…
Going back to the issue: are you saying that the 435 driver is behaving properly, the errors are nothing to care about, and we just should not log them? That’s counterintuitive to me, but I am far from a kernel programmer ))
Since it’s Severity=Corrected I’d just turn off aer. Of course that doesn’t mean everything is fine: those errors always point towards quality problems with the mainboard or, better said, with add-ons like wifi cards (m.2/mini-pcie) or sd-card readers. You didn’t post the full error message, which would also show which device reports that bus error, but I would bet that it’s not the nvidia gpu. Installing the nvidia driver could trigger this because it puts load on the pcie bus, but it’s very unlikely to be the cause. Changes in the kernel are more likely.
To be ultimately sure you’ll have to revert to an earlier driver, of course.
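A rough sketch for finding out which device reports the errors. It assumes the kernel log lines contain the text "PCIe Bus Error" together with a PCI address of the form 0000:00:1c.0, which is the usual aer log format, but check it against your own syslog. count_aer_devices is a made-up helper name:

```shell
# Count PCIe bus error lines per reporting device.
# Reads kernel log lines on stdin, prints "<count> <pci-address>" per device,
# most frequent first. count_aer_devices is a hypothetical helper name.
count_aer_devices() {
    grep -i 'PCIe Bus Error' \
        | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}'
}

# Usage on a running system:
#   sudo dmesg | count_aer_devices
#   lspci -s 00:1c.0    # example address only; use the one printed above
```

Note that the address in the log is often a pcie root port rather than the card itself; lspci -t shows which device hangs off that port.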
Ah… thanks for the great qualification. You might have a point, as I also recently swapped the default Killer WiFi card for an Intel 9260 to get Bluetooth 5… did not think of that at all…
nommconf also just disables aer, another placebo. Disabling aspm can indeed fix those errors but it is disabled on most notebooks anyway. Run
sudo dmesg | grep -i aspm
to see whether or not it is available.
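Besides the dmesg output, the kernel exposes the current aspm policy in /sys/module/pcie_aspm/parameters/policy, with the active one shown in square brackets. A small sketch for reading it, assuming that file exists on your kernel (it will be missing if aspm support is not built in); active_aspm_policy is a made-up helper name:

```shell
# Print the active ASPM policy, i.e. the entry in [square brackets],
# from the policy list read on stdin.
# active_aspm_policy is a hypothetical helper name.
active_aspm_policy() {
    sed -nE 's/.*\[([a-z]+)\].*/\1/p'
}

# Usage on a running system:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
```

This only shows the policy the kernel would apply. Whether aspm is really enabled on a given link is listed per device in the LnkCtl line of sudo lspci -vv.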