Nvidia driver 590.48.01 fails to install on 6.19-rcX kernel

Hello,
I'm having trouble installing the latest Nvidia 590.48.01 driver on Linux kernel 6.19-rcX. I tried every RC from rc1 to rc5, and they all fail the same way.

Can someone help me out with it, please?

nvidia-uvm/uvm_pmm_gpu.c:3063:6: error: ‘const struct dev_pagemap_ops’ has no member named ‘page_free’
3063 | .page_free = devmem_page_free,
| ^~~~~~~~~
nvidia-uvm/uvm_pmm_gpu.c:3063:18: error: initialization of ‘void (*)(struct folio *)’ from incompatible pointer type ‘void (*)(struct page *)’ [-Wincompatible-pointer-types]
3063 | .page_free = devmem_page_free,
| ^~~~~~~~~~~~~~~~
nvidia-uvm/uvm_pmm_gpu.c:3063:18: note: (near initialization for ‘uvm_pmm_devmem_ops.folio_free’)
nvidia-uvm/uvm_pmm_gpu.c:3002:13: note: ‘devmem_page_free’ declared here
3002 | static void devmem_page_free(struct page *page)
| ^~~~~~~~~~~~~~~~
nvidia-uvm/uvm_pmm_gpu.c:3168:6: error: ‘const struct dev_pagemap_ops’ has no member named ‘page_free’
3168 | .page_free = device_coherent_page_free,
| ^~~~~~~~~
nvidia-uvm/uvm_pmm_gpu.c:3168:18: error: initialization of ‘void (*)(struct folio *)’ from incompatible pointer type ‘void (*)(struct page *)’ [-Wincompatible-pointer-types]
3168 | .page_free = device_coherent_page_free,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
nvidia-uvm/uvm_pmm_gpu.c:3168:18: note: (near initialization for ‘uvm_device_coherent_pgmap_ops.folio_free’)
nvidia-uvm/uvm_pmm_gpu.c:3161:13: note: ‘device_coherent_page_free’ declared here
3161 | static void device_coherent_page_free(struct page *page)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
CC [M] nvidia-uvm/uvm_range_tree_test.o
make[4]: *** [/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/scripts/Makefile.build:287: nvidia-uvm/uvm_pmm_gpu.o] Error 1
make[4]: *** Waiting for unfinished jobs…
nvidia-uvm/uvm_hmm.c: In function ‘fill_dst_pfn’:
nvidia-uvm/uvm_hmm.c:2143:9: error: too few arguments to function ‘zone_device_page_init’; expected 2, have 1
2143 | zone_device_page_init(dpage);
| ^~~~~~~~~~~~~~~~~~~~~
In file included from /media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/include/linux/mm.h:33,
from ././common/inc/nv-pgprot.h:30,
from ././common/inc/nv-linux.h:33,
from nvidia-uvm/uvm_linux.h:40,
from nvidia-uvm/uvm_common.h:43,
from nvidia-uvm/uvm_va_block_types.h:27,
from nvidia-uvm/uvm_hmm.h:29,
from nvidia-uvm/uvm_hmm.c:24:
/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/include/linux/memremap.h:227:6: note: declared here
227 | void zone_device_page_init(struct page *page, unsigned int order);
| ^~~~~~~~~~~~~~~~~~~~~
make[4]: *** [/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/scripts/Makefile.build:287: nvidia-uvm/uvm_hmm.o] Error 1
make[3]: *** [/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/Makefile:2054: .] Error 2
make[2]: *** [/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5/Makefile:248: __sub-make] Error 2
make[2]: Leaving directory ‘/var/lib/dkms/nvidia/590.48.01/build’
make[1]: *** [Makefile:248: __sub-make] Error 2
make[1]: Leaving directory ‘/media/HDD-4TB/sdcafe/Kompajlirani_programi/KERNEL/linux-6.19-rc5’
make: *** [Makefile:138: modules] Error 2

It's probably worth reporting this to the NVIDIA/open-gpu-kernel-modules repository on GitHub (NVIDIA Linux open GPU kernel module source).

They don’t really accept issues related to release candidate kernels.

I also had problems installing the Nvidia binary driver 590.48.01 on Linux kernel 6.19 (rc8 in my case) on Fedora 44 (freshly branched from the trunk). I was able to get the driver to compile and install by applying the changes to the C files (but not the Makefiles) described in pull request #1015 on NVIDIA/open-gpu-kernel-modules, "uvm: Fix build failure for Linux 6.19+ due to HMM and PMM API changes" by gg582. I then logged into my KDE desktop and ran some programs, including the nvidia settings utility, which showed the GPU.

To do this you need to unpack the driver package (use the -x command-line argument), apply the patches described in the pull request (I used a text editor, but you could use patch), then run nvidia-installer from the unpacked driver directory to install.

I would not recommend doing this unless you have some experience building code; otherwise you’ll be stuck if a command does not do what it is supposed to.


I managed to get the 590.48.01 driver working on kernel 6.19.0. Three areas need patching.

The first patch is in kernel-open/nvidia-uvm/uvm_hmm.c. The function zone_device_page_init() changed its signature in 6.19 and now takes 3 arguments instead of 1 (the rc5 log at the top of this thread still shows a 2-argument declaration, so it apparently changed again during the RC cycle). You need to add #include <linux/version.h> after the first #include, then wrap the call at line 2143 with a #if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0) guard, using zone_device_page_init(dpage, 0, 0) for 6.19+ and the original zone_device_page_init(dpage) for older kernels.
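
For anyone unsure what that guard looks like, here is a minimal user-space sketch of the pattern. It is not the actual patch and not kernel code: struct page, both zone_device_page_init() stubs, and the init_dst_page() wrapper are stand-ins made up for illustration, and the types of the two extra parameters are placeholders (check include/linux/memremap.h in your own kernel tree for the real declaration). Only the KERNEL_VERSION encoding and the two call forms come from the kernel and from the post above.

```c
/* User-space mock of the version guard. NOT kernel code. */

/* Same encoding the kernel's generated <linux/version.h> uses. */
#define KERNEL_VERSION(a, b, c) \
    (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Assumption: pretend we build against 6.19.0. In a real module this
 * macro comes from <linux/version.h>. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 19, 0)
#endif

/* Minimal stand-in for the kernel's struct page. */
struct page {
    int initialized;
};

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0)
/* Stub with the three-argument shape. The types of the two extra
 * parameters are placeholders, not the kernel's. */
static void zone_device_page_init(struct page *page, long arg2, long arg3)
{
    (void)arg2;
    (void)arg3;
    page->initialized = 1;
}
#else
/* Stub with the old one-argument shape. */
static void zone_device_page_init(struct page *page)
{
    page->initialized = 1;
}
#endif

/* Hypothetical wrapper showing where the guard sits around the call. */
static void init_dst_page(struct page *dpage)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0)
    zone_device_page_init(dpage, 0, 0);
#else
    zone_device_page_init(dpage);
#endif
}
```

Building with -DLINUX_VERSION_CODE=0x061200 (6.18.0) selects the old one-argument branch instead, which is the point of the guard: the same source compiles against both kernels.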

The second patch is in kernel-open/nvidia-uvm/uvm_pmm_gpu.c. The page_free callback was replaced by folio_free in struct dev_pagemap_ops. You need to add #include <linux/version.h> and wrap all affected functions (devmem_page_free, device_p2p_page_free, device_coherent_page_free) with version guards to provide folio_free variants on 6.19+. The same guards are needed on the three dev_pagemap_ops struct assignments. The complete patch for this file is available in PR #1015 on the open-gpu-kernel-modules GitHub repository.
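
To give a rough idea of the shape of that change, here is a user-space sketch for one of the three callbacks. It is not the code from PR #1015: the struct page, struct folio and struct dev_pagemap_ops definitions are simplified stand-ins, and devmem_free_common() and devmem_folio_free() are hypothetical names. Only devmem_page_free, uvm_pmm_devmem_ops, and the page_free/folio_free member names are taken from the build log above.

```c
/* User-space mock of the page_free -> folio_free guard. NOT kernel code. */

/* Same encoding the kernel's generated <linux/version.h> uses. */
#define KERNEL_VERSION(a, b, c) \
    (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Assumption: pretend we build against 6.19.0. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 19, 0)
#endif

/* Simplified stand-ins: here a folio just wraps its first page. */
struct page  { int freed; };
struct folio { struct page page; };

/* Stand-in for the kernel struct, reduced to the one callback. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0)
struct dev_pagemap_ops { void (*folio_free)(struct folio *folio); };
#else
struct dev_pagemap_ops { void (*page_free)(struct page *page); };
#endif

/* Hypothetical shared body: whatever the old callback used to do. */
static void devmem_free_common(struct page *page)
{
    page->freed = 1;
}

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0)
/* New-style callback: receives a folio, forwards its first page. */
static void devmem_folio_free(struct folio *folio)
{
    devmem_free_common(&folio->page);
}
#else
/* Old-style callback, kept for older kernels. */
static void devmem_page_free(struct page *page)
{
    devmem_free_common(page);
}
#endif

/* The ops table needs the same guard on the member name. */
static const struct dev_pagemap_ops uvm_pmm_devmem_ops = {
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 19, 0)
    .folio_free = devmem_folio_free,
#else
    .page_free = devmem_page_free,
#endif
};
```

Keeping the body in one shared helper means the 6.19+ and pre-6.19 callbacks cannot drift apart; the other two callbacks would follow the same pattern.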

The third patch is in kernel-open/Kbuild and kernel/Kbuild. The pre-compiled binary blobs (nv-kernel.o_binary and nv-modeset-kernel.o_binary) were compiled without -mfunction-return=thunk-extern, which causes hundreds of objtool errors like "'naked' return found in MITIGATION_RETHUNK build". The fix is to add two lines after the module Kbuild includes: "$(obj)/nvidia.o: private objtool := true" and "$(obj)/nvidia-modeset.o: private objtool := true". This skips objtool validation on the composite module objects that contain the pre-compiled blobs.

After loading the driver you will see a runtime warning saying "Unpatched return thunk in use". This is expected and harmless. It means the Spectre v2 rethunk mitigation does not cover the pre-compiled blob functions, but the driver works normally.

I confirmed everything working with nvidia-smi on kernel 6.19.0, driver version 590.48.01, CUDA 13.1, on an NVIDIA GeForce RTX 3060/3070 Laptop GPU.

Credits to PR #1015 on the NVIDIA open-gpu-kernel-modules GitHub repository for the UVM patches.


The patch from Peter Jung didn’t work for you?

  ---
  NVIDIA driver build fails with "objtool: call without frame pointer save/setup" on kernels using CONFIG_UNWINDER_FRAME_POINTER

  When building the NVIDIA driver on a custom kernel configured with CONFIG_UNWINDER_FRAME_POINTER=y, the build fails with multiple objtool errors like:

  nvidia.o: error: objtool: _nv049762rm+0x44: call without frame pointer save/setup
  nvidia.o: error: objtool: _nv028852rm+0x11: call without frame pointer save/setup
  ...

  The root cause is a combination of kernel config options:

  ┌───────────────────────────────┬──────────────────────────┬────────────────────────┐
  │            Option             │ Working kernel (default) │ Broken kernel (custom) │
  ├───────────────────────────────┼──────────────────────────┼────────────────────────┤
  │ CONFIG_FRAME_POINTER          │ not set                  │ =y                     │
  ├───────────────────────────────┼──────────────────────────┼────────────────────────┤
  │ CONFIG_UNWINDER_ORC           │ =y                       │ not set                │
  ├───────────────────────────────┼──────────────────────────┼────────────────────────┤
  │ CONFIG_UNWINDER_FRAME_POINTER │ not set                  │ =y                     │
  ├───────────────────────────────┼──────────────────────────┼────────────────────────┤
  │ CONFIG_OBJTOOL_WERROR         │ not set                  │ =y                     │
  ├───────────────────────────────┼──────────────────────────┼────────────────────────┤
  │ CONFIG_STACK_VALIDATION       │ not set                  │ =y                     │
  └───────────────────────────────┴──────────────────────────┴────────────────────────┘

  When the kernel uses the frame pointer unwinder instead of ORC, CONFIG_FRAME_POINTER=y is forced on. Combined with CONFIG_OBJTOOL_WERROR=y, objtool strictly validates that every function has proper frame pointer save/setup
  in its prologue. The NVIDIA proprietary binary blobs (_nvXXXXXXrm symbols) are not compiled with -fno-omit-frame-pointer, so they fail this validation.

  Workarounds:

  1. Switch to ORC unwinder (recommended) — set CONFIG_UNWINDER_ORC=y and CONFIG_UNWINDER_FRAME_POINTER=n. This is the default for most distributions and does not require frame pointers.
  2. Disable CONFIG_OBJTOOL_WERROR — set CONFIG_OBJTOOL_WERROR=n. This turns objtool errors into warnings, allowing the module to build.

  Ideal fix from NVIDIA's side: The proprietary blobs should be compiled with -fno-omit-frame-pointer so they pass objtool validation on kernels that use frame pointer unwinding. This is increasingly relevant as some
  real-time and low-latency kernel configurations prefer frame pointer unwinding for more reliable stack traces.

  ---

I suspect they’re aware of this and have a fix planned for the next 590 release, as there was a 580 release just last week that fixed only this issue:

  • Fixed kernel module build issue with Linux kernel v6.19.

FWIW, I doubt there will be a new 590.x, given that 580.x was updated but 590.x still wasn’t. “New feature branches” are short-lived and can stop receiving updates suddenly. The next release will likely be 595/600, and that may take time.

Same issue here; it just decreases the battery life by 7 hours.