550.78 won't compile on 6.10-rc1 due to GPL violations and removed follow_pfn()

Here we go again. The current stable 550.78 does not compile against the latest release candidate of the mainline Linux kernel. For the open kernel module there is a patch:

diff --git a/kernel/nvidia/os-mlock.c b/kernel/nvidia/os-mlock.c
index 46f99a1..b8f4100 100644
--- a/kernel/nvidia/os-mlock.c
+++ b/kernel/nvidia/os-mlock.c
@@ -30,11 +30,21 @@ static inline int nv_follow_pfn(struct vm_area_struct *vma,
                                 unsigned long address,
                                 unsigned long *pfn)
 {
-#if defined(NV_UNSAFE_FOLLOW_PFN_PRESENT)
-    return unsafe_follow_pfn(vma, address, pfn);
-#else
-    return follow_pfn(vma, address, pfn);
-#endif
+    int status = 0;
+    spinlock_t *ptl;
+    pte_t *ptep;
+
+    if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+        return status;
+
+    status = follow_pte(vma, address, &ptep, &ptl);
+    if (status)
+        return status;
+    *pfn = pte_pfn(ptep_get(ptep));
+
+    // The lock is acquired inside follow_pte()
+    pte_unmap_unlock(ptep, ptl);
+    return 0;
 }
 
 /*!

You can find a discussion here: `follow_pfn()` is removed from kernel · Issue #642 · NVIDIA/open-gpu-kernel-modules · GitHub

With the closed-source kernel module, however, we again get GPL violations, similar to the 6.8 kernel series:

  MODPOST /build/linux610-nvidia/src/NVIDIA-Linux-x86_64-550.78-no-compat32/kernel/Module.symvers
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol '__rcu_read_unlock'
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'follow_pte'
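
To see ahead of a build which of these symbols a given kernel exports as GPL-only, the kernel build tree's Module.symvers can be inspected. Here is a minimal sketch, assuming the usual tab-separated layout (CRC, symbol, module, export type, optional namespace). The function name and the sample lines are hypothetical and the CRC values are made up:

```python
from typing import Dict, Iterable


def export_types(symvers_lines: Iterable[str], wanted: Iterable[str]) -> Dict[str, str]:
    """Return {symbol: export type} for each wanted symbol found in the lines."""
    wanted_set = set(wanted)
    found: Dict[str, str] = {}
    for line in symvers_lines:
        # Assumed layout: CRC <tab> symbol <tab> module <tab> export type [<tab> namespace]
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip lines that do not match the assumed layout
        symbol, export_type = fields[1], fields[3]
        if symbol in wanted_set:
            found[symbol] = export_type
    return found


# Hypothetical sample; on a real system read the lines from the kernel's Module.symvers.
sample = [
    "0x00000000\tfollow_pte\tvmlinux\tEXPORT_SYMBOL_GPL\t\n",
    "0x00000000\t__rcu_read_unlock\tvmlinux\tEXPORT_SYMBOL_GPL\t\n",
    "0x00000000\tsome_plain_symbol\tvmlinux\tEXPORT_SYMBOL\t\n",
]

result = export_types(sample, ["follow_pte", "__rcu_read_unlock"])
for name, kind in sorted(result.items()):
    print(f"{name}: {kind}")
```

Any symbol reported as EXPORT_SYMBOL_GPL will trigger the modpost error above when a module with a non-GPL license string references it.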

It would be much nicer if Nvidia made known issues public and published patches as soon as they exist, so that power users and developers can use the latest Linux kernel to debug and give feedback in time, before the stable kernel reaches end users.

Yes, I reproduced this today on openSUSE with both the vanilla kernel and the sunlight kernel.

The 470 series does not work either.

As a workaround, I use this:

Kernel side:

1. Patch:
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 340bbefe5f652..181965356d9cb 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -406,7 +406,7 @@ void __rcu_read_lock(void)
                WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, true);
        barrier();  /* critical section after entry code. */
 }
-EXPORT_SYMBOL_GPL(__rcu_read_lock);
+EXPORT_SYMBOL(__rcu_read_lock);
 
 /*
  * Preemptible RCU implementation for rcu_read_unlock().
@@ -431,7 +431,7 @@ void __rcu_read_unlock(void)
                WARN_ON_ONCE(rrln < 0 || rrln > RCU_NEST_PMAX);
        }
 }
-EXPORT_SYMBOL_GPL(__rcu_read_unlock);
+EXPORT_SYMBOL(__rcu_read_unlock);
 
 /*
  * Advance a ->blkd_tasks-list pointer to the next entry, instead

2. Patch:

diff --git a/mm/memory.c b/mm/memory.c
index d022c84c22080..d00f494c62f2f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6011,7 +6011,7 @@ int follow_pte(struct vm_area_struct *vma, unsigned long address,
 out:
        return -EINVAL;
 }
-EXPORT_SYMBOL_GPL(follow_pte);
+EXPORT_SYMBOL(follow_pte);
 
 #ifdef CONFIG_HAVE_IOREMAP_PROT
 /**

NVIDIA driver side:

diff --git a/kernel/nvidia/os-mlock.c b/kernel/nvidia/os-mlock.c
index 46f99a1..b8f4100 100644
--- a/kernel/nvidia/os-mlock.c
+++ b/kernel/nvidia/os-mlock.c
@@ -30,11 +30,21 @@ static inline int nv_follow_pfn(struct vm_area_struct *vma,
                                 unsigned long address,
                                 unsigned long *pfn)
 {
-#if defined(NV_UNSAFE_FOLLOW_PFN_PRESENT)
-    return unsafe_follow_pfn(vma, address, pfn);
-#else
-    return follow_pfn(vma, address, pfn);
-#endif
+    int status = 0;
+    spinlock_t *ptl;
+    pte_t *ptep;
+
+    if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+        return status;
+
+    status = follow_pte(vma, address, &ptep, &ptl);
+    if (status)
+        return status;
+    *pfn = pte_pfn(ptep_get(ptep));
+
+    // The lock is acquired inside follow_pte()
+    pte_unmap_unlock(ptep, ptl);
+    return 0;
 }
 
 /*!

And the driver now loads:

sudo dmesg | grep -E "NV|vmlinuz" 
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.10.0-rc1-x64zen4+ root=UUID=a7528a1d-9d96-4917-87e7-63a3c2870243 nouveau.modeset=0 clocksource=tsc tsc=reliable splash resume=/dev/disk/by-uuid/79c14cbd-bdd5-48d9-b6ab-30d060ac0cd7 mitigations=auto quiet security=apparmor nosimplefb=1
[    0.000000] BIOS-e820: [mem 0x000000000a200000-0x000000000a20efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000ec253000-0x00000000ec54cfff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x000000000a200000-0x000000000a20efff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x00000000ec253000-0x00000000ec54cfff] ACPI NVS
[    0.045954] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10.0-rc1-x64zen4+ root=UUID=a7528a1d-9d96-4917-87e7-63a3c2870243 nouveau.modeset=0 clocksource=tsc tsc=reliable splash resume=/dev/disk/by-uuid/79c14cbd-bdd5-48d9-b6ab-30d060ac0cd7 mitigations=auto quiet security=apparmor nosimplefb=1
[    0.046042] Unknown kernel command line parameters "splash BOOT_IMAGE=/boot/vmlinuz-6.10.0-rc1-x64zen4+", will be passed to user space.
[    0.288565] ACPI: PM: Registering ACPI NVS region [mem 0x0a200000-0x0a20efff] (61440 bytes)
[    0.288565] ACPI: PM: Registering ACPI NVS region [mem 0xec253000-0xec54cfff] (3121152 bytes)
[    0.336886] ACPI: \_SB_.PCI0.GPP6.P0NV: New power resource
[    1.738203]     BOOT_IMAGE=/boot/vmlinuz-6.10.0-rc1-x64zen4+
[    6.154214] nvidia: module license 'NVIDIA' taints kernel.
[    6.351520] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  470.239.06  Sat Feb  3 06:03:07 UTC 2024
[    6.463612] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  470.239.06  Sat Feb  3 06:03:51 UTC 2024
[    6.736055] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input19
[    6.736135] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input20
[    6.736245] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input21
[    6.736328] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input22

Removing the GPL-only markers in the kernel is against its license. Let's see what Nvidia's official solution will be.

The GPL restriction doesn't apply to the open kernel module, so if that works properly for you, consider using it.
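
For context, a rough sketch of why the two modules are treated differently: the decision is based on the module's MODULE_LICENSE string. The set below is what I believe the kernel's `license_is_gpl_compatible()` accepts, so verify it against include/linux/license.h in your tree. The "Dual MIT/GPL" string for the open module is an assumption; "NVIDIA" is the license string shown in the dmesg output above. The Python function name is made up for illustration.

```python
# License strings assumed to be treated as GPL-compatible by the kernel
# (from memory of include/linux/license.h; check your kernel tree).
GPL_COMPATIBLE_LICENSES = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MIT/GPL",
    "Dual MPL/GPL",
}


def may_use_gpl_only_symbols(module_license: str) -> bool:
    """Mimic the exact-match check applied to a module's license string."""
    return module_license in GPL_COMPATIBLE_LICENSES


# "Dual MIT/GPL" is assumed for the open kernel module; "NVIDIA" is taken from dmesg.
for license_string in ("Dual MIT/GPL", "NVIDIA"):
    print(license_string, may_use_gpl_only_symbols(license_string))
```

On a running system, `modinfo -F license nvidia` prints the license string of the installed module, which can be fed into the check.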

Now that kernel 6.10 is released, we are still missing a working patch.
In my case, the nvidia-470.256.02 kernel modules fail to build; applying the patch from the open kernel module results in the GPL-only symbol errors:

ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol '__rcu_read_unlock'
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'follow_pte'

Is there any workaround that does not involve modifying the kernel sources?

Could it be that CVE-2024-38610 has affected the release of a patch? (I see follow_pte mentioned multiple times.)

*edited to add CVE reference