750M not supported by current driver contrary to release notes

According to the supported chips list at http://us.download.nvidia.com/XFree86/Linux-x86_64/355.11/README/supportedchips.html, the GeForce GT 750M with PCI ID 0fe4 should be supported by current drivers. However, the last driver that worked with my laptop’s card is 352.21 (I skipped a few releases in between, so I don’t know exactly when it stopped working).

$ lspci -n |grep 0fe4
01:00.0 0302: 10de:0fe4 (rev a1)
$ lspci |grep 01:00.0
01:00.0 3D controller: NVIDIA Corporation GK107M [GeForce GT 750M] (rev a1)

Did I miss something there?

[   35.420747] NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:0fe4)
               NVRM: installed in this system is not supported by the 355.11
               NVRM: NVIDIA Linux driver release.  Please see 'Appendix
               NVRM: A - Supported NVIDIA GPU Products' in this release's
               NVRM: README, available on the Linux driver download page
               NVRM: at www.nvidia.com.

352.30 and 352.41 give the same error, and the driver download page apparently tells me to get the latter…

This sort of thing usually happens when a system power management feature cuts power to the GPU. Please try disabling any power saving options in the BIOS and in the Linux environment.

That doesn’t change a thing. I ran modprobe nvidia from a tty on a bare system, with no power management tools (tlp, laptop-mode-tools, or otherwise) and no desktop environment running (those usually come with some sort of power management as well). bbswitch wasn’t loaded at the time either. It would cause different symptoms anyway, such as the card refusing to leave a particular power state, and that comes with a clear message in dmesg (and hasn’t happened in months).

It would also be a weird symptom to appear with a driver update, at module load time, if it were a BIOS/EFI setting issue. (The firmware setup doesn’t seem to have any GPU-specific options.)

If the GPU were unreachable, I’d expect the driver to fail to find the card, not to claim it is unsupported. If that is what is happening, it should be detected and reported with a different error message.

Echoing 1 to the PCI device’s reset, rescan, and enable entries under /sys/bus/pci/devices/0000:01:00.0 doesn’t affect the outcome either, nor does changing anything in its power/ subdirectory.
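
For anyone who wants to repeat the power/ check, here is a minimal read-only sketch. It assumes the kernel exposes the usual runtime power management attributes power/control and power/runtime_status for the device. The function name power_snapshot and its sysfs_root parameter are made up for illustration, and nothing here is specific to the NVIDIA driver.

```python
from pathlib import Path


def power_snapshot(bdf="0000:01:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Read a few runtime-PM attributes of a PCI device from sysfs.

    Returns a dict mapping attribute name to its stripped contents,
    or to None if the attribute is missing or unreadable.
    Hypothetical helper; bdf defaults to the address from this thread.
    """
    power_dir = Path(sysfs_root) / bdf / "power"
    snapshot = {}
    for name in ("control", "runtime_status"):
        try:
            snapshot[name] = (power_dir / name).read_text().strip()
        except OSError:
            snapshot[name] = None
    return snapshot


if __name__ == "__main__":
    # On a machine without this device, both values are simply None.
    print(power_snapshot())
```

A control value of "on" means runtime power management is disabled for the device, while "auto" lets the kernel suspend it when idle. Comparing the snapshot before and after modprobe nvidia would show whether anything changed the device’s power state in between.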

Are you sure this particular PCI id is supposed to be recognized by the driver code?

Yes, that device is in the list of supported devices. However, the driver doesn’t use that list directly. Instead, it queries the GPU to read some internal identification information and makes decisions based on that. If that read fails, then it can trigger this particular error. That’s why when this error shows up on a GPU that’s definitely supported, the first thing to look for is a system-level problem that is preventing communication between the driver and the GPU.
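
As a rough illustration of what a "failed read" can look like from user space, here is a sketch that decodes the vendor and device IDs from the first four bytes of the device’s PCI configuration space, as exposed in the sysfs config file. A device that does not respond on the bus typically reads back as all ones (ffff:ffff). This is an assumption-laden stand-in, not the driver’s actual check: the driver’s identification read is internal and is not reproduced here, and the helper names are hypothetical. Note also that lspci in this thread reported 10de:0fe4 (rev a1), so configuration space was readable at the time lspci was run. A passing result here therefore does not rule out a problem on the path the driver uses.

```python
from pathlib import Path


def parse_ids(config):
    """Decode the little-endian vendor ID (offset 0) and device ID (offset 2)."""
    if len(config) < 4:
        raise ValueError("need at least 4 bytes of config space")
    vendor = int.from_bytes(config[0:2], "little")
    device = int.from_bytes(config[2:4], "little")
    return vendor, device


def looks_unreachable(config):
    """True if both IDs read back as all ones, as for a non-responding device."""
    return parse_ids(config) == (0xFFFF, 0xFFFF)


def read_config(bdf="0000:01:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Return the raw config-space bytes that sysfs exposes for the device."""
    return (Path(sysfs_root) / bdf / "config").read_bytes()


if __name__ == "__main__":
    try:
        cfg = read_config()
    except OSError as exc:
        print(f"could not read config space: {exc}")
    else:
        vendor, device = parse_ids(cfg)
        state = "unreachable" if looks_unreachable(cfg) else "responds"
        print(f"{vendor:04x}:{device:04x} {state}")
```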

Guess I’ll try with different kernel versions, and maybe see what nouveau makes of it…
sigh, testing things which require reboots is so time-consuming and inconvenient (especially with blindfolds on (iow. without source code -_-))…

Okay, so I just compiled it against an older kernel and apparently it’s working with linux-4.1.2, but not with linux-4.2.2, that’s interesting (and frustrating).