Nvidia driver installation fails

The Nvidia driver fails to load.
sudo apt-get install nvidia-driver-525 doesn't work.
I purged the driver and reinstalled it manually, but that fails too. The log file is attached.
Machine: x86-64, L4 GPU, DeepStream 6.3, CUDA 12.0

nvidia-installer.log (63.3 KB)

Hi @amaunder,

The purge does not seem to have worked correctly:

-> An alternate method of installing the NVIDIA driver was detected. (This is usually a package provided by your distributor.) A driver installed via that method may integrate better with your system than a driver installed by nvidia-installer.

Please review the message provided by the maintainer of this alternate installation method and decide how to proceed:

The NVIDIA driver provided by Ubuntu can be installed by launching the "Software & Updates" application, and by selecting the NVIDIA driver from the "Additional Drivers" tab.

Also, the kernel module could not be compiled, which can have several causes:

  • conflicting compiler versions
  • wrong or missing kernel headers
  • still loaded kernel modules
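
For a machine without a desktop UI, the three causes listed above can be inspected from a terminal. The following is only a rough sketch: check_build_prereqs is a made-up helper name, and the checks are read-only, so they report state without changing anything.

```shell
# Sketch: inspect the three usual causes of a failed NVIDIA kernel module build.
# check_build_prereqs is a hypothetical helper name, not part of any NVIDIA tool.
check_build_prereqs() {
    kver="$(uname -r)"

    # 1. Compiler: /proc/version records the compiler the running kernel was built with.
    echo "== compiler =="
    if command -v gcc >/dev/null 2>&1; then
        gcc --version | head -n 1
    else
        echo "gcc not found"
    fi
    if [ -r /proc/version ]; then
        cat /proc/version
    fi

    # 2. Kernel headers: the installer builds against /lib/modules/<kernel>/build.
    echo "== kernel headers for ${kver} =="
    if [ -d "/lib/modules/${kver}/build" ]; then
        echo "present: /lib/modules/${kver}/build"
    else
        echo "missing: /lib/modules/${kver}/build"
    fi

    # 3. Loaded modules: any nvidia or nouveau module still in the kernel.
    echo "== loaded nvidia/nouveau modules =="
    if [ -r /proc/modules ]; then
        grep -E '^(nvidia|nouveau)' /proc/modules || echo "none loaded"
    else
        echo "/proc/modules not readable"
    fi
}

check_build_prereqs
```

If the headers directory is reported missing, installing the linux-headers package that matches uname -r is the usual remedy on Ubuntu.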

I recommend checking the README for this driver version on troubleshooting and installation recommendations.

Thanks!

(1) I don't have access to the UI of the machine. Are there terminal commands that I can use instead?
(2) Is there a URL for the troubleshooting and installation recommendations?

(3) If the purge didn't happen, what do I need to do to make sure it completes successfully? I ran lsmod and saw no nvidia driver loaded. I also stopped gdm.

(4) Also, what is the right way to install? I assume it's: sudo apt-get install nvidia-driver-525. Perhaps I can use the .run file with the right driver.

I am sure you saw it, but I am pasting it below anyway in case it saves you some time. The errors I see are:
ERROR: An error occurred while performing the step: "Building kernel modules". See /var/log/nvidia-installer.log for details.
-> The command cd ./kernel; /usr/bin/make -k -j32 NV_EXCLUDE_KERNEL_MODULES="" SYSSRC="/lib/modules/5.15.0-124-generic/build" SYSOUT="/lib/modules/5.15.0-124-generic/build" failed with the following output:

make[1]: Entering directory '/usr/src/linux-headers-5.15.0-124-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
You are using: cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

MODPOST /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'rcu_read_unlock_strict'
make[2]: *** [scripts/Makefile.modpost:133: /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers] Error 1
make[2]: *** Deleting file '/tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers'
make[2]: Target '__modpost' not remade because of errors.
make[1]: *** [Makefile:1829: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-124-generic'
make: *** [Makefile:82: modules] Error 2
-> Checking to see whether the nvidia kernel module was successfully built
executing: 'cd ./kernel; /usr/bin/make -k -j32 NV_EXCLUDE_KERNEL_MODULES="" SYSSRC="/lib/modules/5.15.0-124-generic/build" SYSOUT="/lib/modules/5.15.0-124-generic/build" NV_KERNEL_MODULES="nvidia"'...
make[1]: Entering directory '/usr/src/linux-headers-5.15.0-124-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
You are using: cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
MODPOST /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'rcu_read_unlock_strict'
make[2]: *** [scripts/Makefile.modpost:133: /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers] Error 1
make[2]: *** Deleting file '/tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers'
make[2]: Target '__modpost' not remade because of errors.
make[1]: *** [Makefile:1829: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-124-generic'
make: *** [Makefile:82: modules] Error 2
-> Error.
ERROR: An error occurred while performing the step: "Checking to see whether the nvidia kernel module was successfully built". See /var/log/nvidia-installer.log for details.
-> The command cd ./kernel; /usr/bin/make -k -j32 NV_EXCLUDE_KERNEL_MODULES="" SYSSRC="/lib/modules/5.15.0-124-generic/build" SYSOUT="/lib/modules/5.15.0-124-generic/build" NV_KERNEL_MODULES="nvidia" failed with the following output:

make[1]: Entering directory '/usr/src/linux-headers-5.15.0-124-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
You are using: cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
MODPOST /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'rcu_read_unlock_strict'
make[2]: *** [scripts/Makefile.modpost:133: /tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers] Error 1
make[2]: *** Deleting file '/tmp/selfgz51327/NVIDIA-Linux-x86_64-525.125.06/kernel/Module.symvers'
make[2]: Target '__modpost' not remade because of errors.
make[1]: *** [Makefile:1829: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-124-generic'
make: *** [Makefile:82: modules] Error 2
ERROR: The nvidia kernel module was not created.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
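
Since the full log is long and the build output repeats, a small filter can make the decisive lines easier to spot. This is just a sketch: summarize_installer_log is a made-up helper name, and the default path is the one named in the installer's own error message. In the output pasted here, the two compiler strings report the same 9.4.0 version, so the modpost error looks like the line to focus on.

```shell
# Sketch: print only the distinct ERROR lines of an nvidia-installer log.
# summarize_installer_log is a hypothetical helper; the path argument defaults
# to the location named in the installer's error message.
summarize_installer_log() {
    log="${1:-/var/log/nvidia-installer.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep keeps lines containing "ERROR:"; sort -u collapses the repeats that
    # come from the installer re-running the build.
    grep 'ERROR:' "$log" | sort -u
}

# Usage (may need sudo, depending on the log file's permissions):
#   summarize_installer_log
#   summarize_installer_log /path/to/copied/nvidia-installer.log
```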

Below are the terminal commands I used. After running them, nvidia-smi seems to work:

ubuntu-drivers devices
sudo apt install nvidia-driver-535-server-open
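
To double-check that the driver really is loaded after the install (a reboot may be needed first), something like the following could be run. It is a sketch only, and verify_nvidia_driver is a made-up helper name; it reads the version file the loaded kernel module exposes and then queries nvidia-smi.

```shell
# Sketch: confirm the NVIDIA kernel module is loaded and nvidia-smi can reach the GPU.
# verify_nvidia_driver is a hypothetical helper name.
verify_nvidia_driver() {
    # The loaded kernel module exposes its version under /proc/driver/nvidia.
    if [ -r /proc/driver/nvidia/version ]; then
        cat /proc/driver/nvidia/version
    else
        echo "nvidia kernel module not loaded"
    fi

    # nvidia-smi reports the GPU name and the user-space driver version.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            || echo "nvidia-smi failed to query the GPU"
    else
        echo "nvidia-smi not found"
    fi
}

verify_nvidia_driver
```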