465.31 Nvidia driver with Clang LTO enabled fails to load with 5.13

Compiling the 465.31 Nvidia kernel modules with Clang LTO enabled works fine under 5.12. Both the kernel and the Nvidia modules were compiled with Clang LTO enabled, and the modules load as expected under the active kernel with no known issues.

With the new 5.13 kernel, depmod produces the following warnings and the modules will not load under the active kernel:

nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_rax
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_rcx
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_rdx
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_r10
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_r9
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_rsi
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rax
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r15
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r11
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rsi
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r9
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r8
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r10
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rdx
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rcx
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r12
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_r13
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rbx
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rbp
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rax
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r8
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r10
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rdx
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rcx
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r9
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rdi
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r11
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rsi
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r13
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rbx
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_r12
nvidia.ko needs unknown symbol __x86_indirect_alt_call_rax
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r9
nvidia.ko needs unknown symbol __x86_indirect_alt_call_rdx
nvidia.ko needs unknown symbol __x86_indirect_alt_call_rcx
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r13
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r10
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r8
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r12
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r11
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r14
nvidia.ko needs unknown symbol __x86_indirect_alt_call_r15
nvidia.ko needs unknown symbol __x86_indirect_alt_call_rbx
nvidia.ko needs unknown symbol __x86_indirect_alt_call_rsi
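
For anyone comparing their own output against the list above, here is a minimal Python sketch that groups such warnings per module. It assumes the lines look like the ones quoted in this post; the function name `group_unknown_symbols` is made up for illustration and is not part of depmod or the driver.

```python
import re
from collections import defaultdict

# Matches lines of the form quoted above. Any text before the module
# file name on the same line is ignored.
_PATTERN = re.compile(
    r"(?P<module>[\w.-]+\.ko)\s+needs unknown symbol\s+(?P<symbol>\w+)"
)


def group_unknown_symbols(depmod_output):
    """Return {module: sorted list of unresolved symbols} (hypothetical helper)."""
    missing = defaultdict(set)
    for line in depmod_output.splitlines():
        match = _PATTERN.search(line)
        if match:
            missing[match.group("module")].add(match.group("symbol"))
    return {module: sorted(symbols) for module, symbols in missing.items()}


# Three lines copied from the warnings above, used as sample input.
sample = """\
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_jmp_rax
nvidia-modeset.ko needs unknown symbol __x86_indirect_alt_call_rax
nvidia.ko needs unknown symbol __x86_indirect_alt_jmp_rax
"""

for module, symbols in group_unknown_symbols(sample).items():
    print(module, len(symbols), symbols)
```

Every unresolved symbol in the list has the same `__x86_indirect_alt_` prefix, so a summary like this mainly helps to confirm that nothing else is missing.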

Update:

The issue is reproducible with driver 470.57.02 as well.

The 470 driver’s kernel modules compile with full Clang LTO enabled and load under kernel 5.12.19 without issues.

With kernel 5.13, the 470 driver produces the above-mentioned warnings and its kernel modules fail to load.

Update:

The issue can be reproduced with driver 470.63.01.

Same with kernel 5.14.13 and nvidia-drivers 495; really wish this would get fixed.

This happens because objtool unconditionally rewrites retpoline calls during kernel/module compilation, even if the kernel was built without retpoline support.
Last time I checked, some changes to this behavior, as well as to the overall thin LTO compilation process, were being proposed and should probably address this issue as well.
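
To check whether a given kernel falls into this case, a rough Python sketch like the one below can read the kernel configuration. It assumes the config is available as plain text (for example from a `.config` file in the kernel source tree) and that `CONFIG_RETPOLINE`, `CONFIG_LTO_CLANG_THIN` and `CONFIG_LTO_CLANG_FULL` are the relevant options for these kernel versions. The helper names are invented for illustration.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names set to y or m (hypothetical helper)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skips blank lines and "# CONFIG_FOO is not set" entries.
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def describe_build(config_text):
    """Summarise the options relevant to this thread (hypothetical helper)."""
    opts = enabled_options(config_text)
    return {
        "retpoline": "CONFIG_RETPOLINE" in opts,
        "thin_lto": "CONFIG_LTO_CLANG_THIN" in opts,
        "full_lto": "CONFIG_LTO_CLANG_FULL" in opts,
    }


# Made-up config fragment: thin LTO enabled, retpoline disabled.
sample_config = """\
CONFIG_LTO_CLANG_THIN=y
# CONFIG_LTO_CLANG_FULL is not set
# CONFIG_RETPOLINE is not set
"""

print(describe_build(sample_config))
```

If `retpoline` comes out false while one of the LTO options is true, the setup matches the situation described above.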

In the meantime, this should make it possible to compile nv modules with clang and thin LTO:

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e5947fbb9e7a..d8e77c24718b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1858,9 +1858,11 @@ static int decode_sections(struct objtool_file *file)
 	 * alternatives. Must be after add_{jump,call}_destination(), since
 	 * those create the call insn lists.
 	 */
-	ret = arch_rewrite_retpolines(file);
-	if (ret)
-		return ret;
+	if (retpoline) {
+		ret = arch_rewrite_retpolines(file);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }

Thank you for this workaround.

Still reproducible with nvidia-drivers 495.46 - kernel 5.15.7 (-flto=thin)

I am also getting some new stack validation warnings compared to a non-LTO build:

/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv019098rm()+0x68: stack state mismatch: reg1[5]=-1+0 reg2[5]=-2-48
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv026765rm()+0x132: return with modified stack frame
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv033141rm()+0xc1: return with modified stack frame
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv011684rm()+0xe4: stack state mismatch: reg1[5]=-1+0 reg2[5]=-2-64
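
In case it helps with comparing LTO and non-LTO build logs, here is a small Python sketch that counts objtool warnings by kind. It assumes the warnings follow the `<object>: warning: objtool: <function>()+<offset>: <message>` layout seen in the lines above; `count_objtool_warnings` is a made-up name.

```python
import re
from collections import Counter

# Layout assumed from the four warnings quoted above.
_WARNING = re.compile(
    r"^(?P<obj>.+?): warning: objtool: "
    r"(?P<func>[\w.]+)\(\)\+(?P<offset>0x[0-9a-fA-F]+): (?P<msg>.+)$"
)


def count_objtool_warnings(build_log):
    """Return {warning kind: count}; the kind is the message text up to
    the first colon (hypothetical helper)."""
    kinds = Counter()
    for line in build_log.splitlines():
        match = _WARNING.match(line.strip())
        if match:
            kinds[match.group("msg").split(":")[0]] += 1
    return dict(kinds)


# The four warnings from this post, used as sample input.
log = """\
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv019098rm()+0x68: stack state mismatch: reg1[5]=-1+0 reg2[5]=-2-48
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv026765rm()+0x132: return with modified stack frame
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv033141rm()+0xc1: return with modified stack frame
/var/lib/dkms/nvidia/495.46/build/nvidia.lto.o: warning: objtool: _nv011684rm()+0xe4: stack state mismatch: reg1[5]=-1+0 reg2[5]=-2-64
"""

print(count_objtool_warnings(log))
```

Running it over both build logs and comparing the two results shows which warning kinds are new with LTO.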

Not sure if I should worry about that …

I have no idea whether there is a formal way of checking the built driver’s binary correctness. All I can say is that it seems to be running fine to the extent I’m using it.

Note that the aforementioned warnings refer to symbols originating from the precompiled nvidia/nv-kernel.o_binary object that is part of the driver package. This is the same object that is compiled with retpoline support and that normally freaks out objtool.
Perhaps it’s not playing nicely with some of clang’s linking steps. I haven’t looked any further into that.