I’m using a GTX 260M with openSUSE Tumbleweed, and I haven’t been able to use the desktop since I updated the kernel (from 4.8.8 to 4.20).
I tried different kernels from 4.12 up to 5.0 without success. From 4.12 to 4.18 I was not able to build the driver due to this error:
“Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel”. I installed libelf-dev (also the 32-bit package), but that does not seem to help. I also ran into another issue with the driver: I was getting “module: nvidia: Unknown rela relocation: 4” when trying to load the kernel module, but I fixed that by downgrading the binutils package after reading [url]https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=908568[/url].
So I tried newer kernels until it compiled, but I’m getting crashes on 4.20 and 5.0 (with the driver patched to make it compile on the latest kernel).
I’m able to use the driver without any issues on the 4.8.8 kernel (but I have another, unrelated issue that is a blocker to continue using that kernel version).
nvidia-bug-report.sh crashes Linux, but at least in safe mode I was able to recover the crash report.
I was also able to capture the kernel errors when the system crashes.
Hi, I uploaded the crash log from one of the many tests I tried. I actually also built the module with GCC 8.3.1 first, but I had the exact same issue. Anyway, here’s my crash log with the module built with GCC 8.3.1. nvidia-260m-crash.log (15.2 KB) nvidia-bug.report.log.gz (71.9 KB)
Ok, looking at the crashdump, at the beginning, there’s this:
Mar 23 15:49:36 spartan-nb kernel: pciehp 0000:00:01.0:pcie004: Slot(16): Card not present
Mar 23 15:49:36 spartan-nb kernel: pciehp 0000:00:01.0:pcie004: Slot(16): Card present
meaning the PCIe hotplug driver detects a removal and an immediate re-insertion of the GPU. The pciehp driver was rewritten for the 4.19 kernel, so that seems to have introduced a bug in your case. Please try to disable it using the kernel parameter
pci=nopciehp
and if that helps, report a bug with your distro’s bug-tracker.
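To spot this pattern in a longer log, a minimal Python sketch could scan for a “Card not present” message directly followed by “Card present” on the same slot. The regular expression is modelled only on the two pciehp lines quoted above, and the function name `find_presence_flaps` is made up for illustration; other kernel versions may word these messages differently.

```python
import re

# Pattern modelled on the two pciehp lines quoted in this thread, e.g.
# "pciehp 0000:00:01.0:pcie004: Slot(16): Card not present"
PCIEHP_RE = re.compile(
    r"pciehp (?P<port>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\S*\s+"
    r"Slot\((?P<slot>[^)]+)\): (?P<event>Card not present|Card present)"
)


def find_presence_flaps(log_text):
    """Return (port, slot) pairs where a 'Card not present' event is
    directly followed by 'Card present' for the same port and slot."""
    last_event = {}  # (port, slot) -> last presence event seen
    flaps = []
    for line in log_text.splitlines():
        m = PCIEHP_RE.search(line)
        if not m:
            continue
        key = (m.group("port"), m.group("slot"))
        event = m.group("event")
        if event == "Card present" and last_event.get(key) == "Card not present":
            flaps.append(key)
        last_event[key] = event
    return flaps


# The two lines from the crash log above
SAMPLE_LOG = (
    "Mar 23 15:49:36 spartan-nb kernel: pciehp 0000:00:01.0:pcie004: Slot(16): Card not present\n"
    "Mar 23 15:49:36 spartan-nb kernel: pciehp 0000:00:01.0:pcie004: Slot(16): Card present\n"
)

if __name__ == "__main__":
    print(find_presence_flaps(SAMPLE_LOG))
```

Feeding it the saved kernel log as a string (for example the contents of the attached crash log) lists every port and slot that flapped.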
Sorry, after reading the whole story it turned out that this parameter was proposed but never actually implemented. So I don’t really know an easy way to disable hotplug without building a custom kernel. I don’t know if the PCIe slot capability can be manipulated. Please post the output of:
No problem. For the moment I can continue using kernel 4.8.8, as I managed to fix the unrelated problems that led me to upgrade the kernel (I was having issues with Qt5 because of a change that breaks Qt on old kernels: https://superuser.com/questions/1347723/arch-on-wsl-libqt5core-so-5-not-found-despite-being-installed ). Having an old kernel is not ideal, but I can deal with that.
It does indeed look like the BIOS incorrectly sets the HotPlug capability (+); it’s a notebook after all. That still doesn’t explain why the new pciehp driver freaks out. Maybe also check for a BIOS update, and cross-check whether the hotplug bit is also set when using the 4.8 kernel.
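For the cross-check between kernels, a small Python sketch can pull the HotPlug flag out of saved `lspci -vv` output so the two runs can be compared. This assumes the capability is read from the `SltCap:` line that `lspci -vv` prints for PCIe ports; the thread does not show the exact command that was used. The function name `hotplug_ports` and the sample text are made up for illustration.

```python
import re

# "SltCap: ... HotPlug+ ..." means the slot advertises hotplug; "HotPlug-" means it does not.
HOTPLUG_RE = re.compile(r"SltCap:.*\bHotPlug([+-])")


def hotplug_ports(lspci_text):
    """Map each device address in `lspci -vv` text to True/False depending on
    whether its slot capabilities advertise HotPlug. Devices that have no
    SltCap line are left out of the result."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        # Device header lines start in column 0, e.g. "00:01.0 PCI bridge: ..."
        if line and not line[0].isspace():
            current = line.split()[0]
            continue
        m = HOTPLUG_RE.search(line)
        if m and current is not None:
            result[current] = m.group(1) == "+"
    return result


# Made-up sample in the layout that `lspci -vv` uses (not taken from this thread)
SAMPLE = (
    "00:01.0 PCI bridge: Example root port\n"
    "\tCapabilities: [88] Express (v2) Root Port (Slot+), MSI 00\n"
    "\t\tSltCap:\tAttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise-\n"
    "00:1c.0 PCI bridge: Example root port\n"
    "\t\tSltCap:\tAttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-\n"
)

if __name__ == "__main__":
    print(hotplug_ports(SAMPLE))
```

Running it on the output captured under the 4.8 kernel and again under the newer kernel shows whether the bit differs between the two.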
I checked on 4.8.8 and hotplug is also enabled for the PCIe lane of the VGA. I also checked for a BIOS update; there was one, but it didn’t seem to include any related fix. I updated it anyway.
I tried this, disabling hotplug on the VGA. The system still crashes, but it produces a different error, as you might imagine. I’ll attach the crash log that I was able to capture, but this time it doesn’t seem clear what’s going on.