The glxinfo output just shows you’re invoking the nvidia client gl on a mesa server glx. Incorrect setup, please delete that file.
Hi there. I have exactly the same issue as you (Legion 5 Pro 16IAH7H). Did you write a patch file to share? Did you manage to solve this issue?
I’ll send other outputs when I have a bit more time to test. Removing 20-modesetting.conf (i.e. exactly the same as starting X again with nvidia-drivers running) gives me a lot of issues (dmesg errors + freezes while trying to run some games to test).
@lorenzo.nizzi.grifi : no, not fixed, there is a workaround as you can see above, but it’s an ugly solution.
Never mind. Something interesting: you’re running a Lenovo xxIAH7H (I have a 15IAH7H), and I’m quite sure it’s the same hardware as mine. That could be a clue towards a fix.
An ugly workaround, I agree, but is it working somehow? Do you still get errors when you load games?
An easy and fast thing that should fix it: go to your BIOS settings and enable downgrade. Then downgrade to the first BIOS version your laptop had when you received it, and update again to the latest available BIOS version. I’m fairly sure, though not 100%, that a Lenovo update did something to the video card BIOS, and this will get your video BIOS back to its original state.
Hello, and sorry for being late in replying to this topic.
I can confirm the issue is still present in the latest kernels (6.11).
@muhamadkateb7 : not really sure it’s feasible, since with the original BIOS the laptop was completely unusable; even the BIOS menus were taking more than 30 minutes to refresh.
Hello again guys,
I’ve made a few more investigations…
I tried @muhamadkateb7’s idea. It’s NOT working, but… here is what I’ve done:
I downgraded the VBIOS to its original version: 94.06.34.00.26. BUT I got a warning message saying the PCI Subsystem ID was different (really strange, since it was previously updated using Lenovo software).
Then I downgraded the BIOS to J2CN29WW, which is the oldest one I could find.
In this case, VBIOS 00.26 + BIOS J2CN29WW, kernel 6.11 is working.
Here is nvidia-bug-report : nvidia-bug-report_6.11_DowngradeBiosVbios.log.gz (1.4 MB)
Then I updated the BIOS to the latest version: J2CN57WW. Kernel 6.11 stopped working.
Some older kernels are also giving kernel oopses/panics (this didn’t happen with the newer VBIOS 00.2F).
nvidia-bug-report_6.11_VBiosDown_BiosUP.log.gz (1.3 MB)
Also, the ugly workaround with pcie_failed_link_retrain is working in 6.11.
Hope this can help in finding out the issue.
I can also do more tests during the next days.
Any help would be greatly appreciated.
Thanks for your tests. I guess unfortunately this Lenovo model (despite all BIOS updates or downgrades) is somehow incompatible with nvidia under linux. I guess a kernel regression or something like that. Did you create a kernel patch for the ugly workaround?
I directly patched the source code of pcie_failed_link_retrain(struct pci_dev *dev), adding return true as the first line of the function. See some messages above.
We seriously need a proper fix through a VBIOS/BIOS/Nvidia driver update…
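In case it helps others compare a working and a non-working boot: as far as I understand, pcie_failed_link_retrain is the kernel quirk that retrains a failing PCIe link at 2.5 GT/s, so one thing worth looking at is the link speed the kernel reports for the dGPU. Below is a minimal Python sketch, not a fix, that reads the standard sysfs attributes current_link_speed and max_link_speed. The address 0000:01:00.0 is an assumption (look yours up with lspci -D), and whether the failure in this thread shows up as a reduced link speed is also an assumption.

```python
from pathlib import Path


def parse_gts(text):
    """Turn a sysfs value such as '16.0 GT/s PCIe' into a float (16.0)."""
    return float(text.split()[0])


def link_status(device_dir):
    """Return (current, maximum, below_max) for one PCI device directory."""
    dev = Path(device_dir)
    current = parse_gts((dev / "current_link_speed").read_text())
    maximum = parse_gts((dev / "max_link_speed").read_text())
    return current, maximum, current < maximum


if __name__ == "__main__":
    # Assumed address of the dGPU; adjust to what `lspci -D` shows on your machine.
    gpu = "/sys/bus/pci/devices/0000:01:00.0"
    try:
        current, maximum, below_max = link_status(gpu)
        print(f"current: {current} GT/s, max: {maximum} GT/s, below max: {below_max}")
    except (OSError, ValueError) as err:
        # OSError: device not enumerated; ValueError: speed reported as 'Unknown'.
        print(f"could not read link speed for {gpu}: {err}")
```

Keep in mind that a GPU can lower its link speed at idle to save power, so a lower current speed on its own proves nothing. It is only useful as a comparison under load between a good and a bad BIOS/kernel combination.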
It seems that the latest BIOS () and Kernel 6.14 fixed the issue finally? Can someone else confirm that?
Short update: the dGPU seems to work only intermittently with the latest BIOS, kernel and Nvidia driver, so this appears to be more of a workaround than an actual fix.
After further investigation, I believe the issue is most likely related to the BIOS rather than the Nvidia driver or kernel itself. I have tested all BIOS versions, and I have confirmed that the issue first appeared starting from BIOS version J2CN40WW.
Changelog for J2CN40WW:
- (Fixed) Enhancement to address security vulnerability LEN-73440/73442
- (Fix) Securely close DPTF tool interface.
- (Fixed) Disable WMI by securely setting BIOS variables.
- (Fix) Enhancement to address security vulnerability LEN-73440/73442.
Based on this changelog, I suspect that the disabling of WMI and the secure setting of BIOS variables might be causing issues with the dGPU functionality, specifically with how PCIe and Resizable BAR (rBAR) are handled. The lack of WMI support might prevent proper hardware initialization or communication between the BIOS and the Linux kernel, leading to the failure of the dGPU.
Unfortunately, this bug may not be easily fixed by Lenovo unless escalated through TAC. However, a practical solution for now would be to downgrade to BIOS version J2CN37WW, which resolves the issue and fully restores dGPU functionality under Linux. It’s not the ideal fix, but it will work.
Hope this helps anyone facing the same issue!
All BIOS versions can be found on the Lenovo CN site; just google "J2CN29WW lenovo cn".
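To quickly see which side of that boundary a machine is on, here is a small Python sketch that reads the BIOS version from /sys/class/dmi/id/bios_version and compares it with J2CN40WW, the first version where the issue appeared in the testing above. It assumes the version string contains the J2CNxxWW pattern; the helper names are made up for illustration, and the result only reflects the reports in this thread, not an official Lenovo statement.

```python
import re
from pathlib import Path

# First BIOS build reported as problematic in this thread (J2CN40WW).
FIRST_BAD = 40


def bios_number(version):
    """Extract the build number from a string like 'J2CN57WW', or None."""
    match = re.search(r"J2CN(\d+)WW", version)
    return int(match.group(1)) if match else None


def is_affected(version):
    """True if the build is J2CN40WW or newer, False if older, None if unknown."""
    number = bios_number(version)
    return None if number is None else number >= FIRST_BAD


if __name__ == "__main__":
    try:
        version = Path("/sys/class/dmi/id/bios_version").read_text().strip()
        print(f"BIOS {version}: affected range according to this thread = {is_affected(version)}")
    except OSError as err:
        print(f"could not read BIOS version: {err}")
```

A result of None just means the string did not match the J2CN naming, for example on a different Lenovo model line.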
OK, that’s what I’ve always thought about it. Many thanks for all your extensive testing. Is it safe to downgrade to that BIOS? When it comes to BIOS I am always pretty conservative: don’t break what works (I can play under Windows). I have a Lenovo pro 5 2022 (a model pretty similar to yours).
