Kernel module was not rebuilt when updated kernel was installed on RHEL6

We have an ongoing problem where users who have installed the Nvidia drivers on their laptops end up with a hung system following a kernel update.

I believe there is something not quite right with the DKMS setup on these machines. For example, on my system, here is the log from my last install of the Nvidia drivers:

nvidia-installer log file ‘/var/log/nvidia-installer.log’
creation time: Fri May 16 12:38:46 2014
installer version: 331.67

PATH: /usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ibm/c4eb/bin:/usr/kerberos/bin:/root/bin:/opt/websm/bin

nvidia-installer command line:
./nvidia-installer

Using: nvidia-installer ncurses user interface
→ Tagging shared libraries with chcon -t textrel_shlib_t.
→ License accepted.
→ Installing NVIDIA driver version 331.67.
→ There appears to already be a driver installed on your system (version: 331.67). As part of installing this driver (version: 331.67), the existing driver will be uninstalled. Are you sure you want to continue? (‘no’ will abort installation) (Answer: Yes)
→ Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. (Answer: Yes)
→ Installing both new and classic TLS OpenGL libraries.
→ Installing both new and classic TLS 32bit OpenGL libraries.
→ Install NVIDIA’s 32-bit compatibility libraries? (Answer: Yes)
→ Parsing log file:
→ done.
→ Validating previous installation:
→ done.
→ Uninstalling NVIDIA Accelerated Graphics Driver for Linux-x86_64 (1.0-33167 (331.67)):
→ done.
→ Uninstallation of existing driver: NVIDIA Accelerated Graphics Driver for Linux-x86_64 (331.67) is complete.
→ nvidia-installer will install the libvdpau and libvdpau_trace libraries that were included with this installer package. These libraries are available separately through the libvdpau project and will be removed from the NVIDIA Linux driver installer package in the future, so it is recommended that VDPAU users install libvdpau separately, e.g. by using packages available from their distributions, or by building from the sources available at:

http://people.freedesktop.org/~aplattner/vdpau
→ Searching for conflicting X files:
→ done.
→ Searching for conflicting OpenGL files:
→ done.
→ Installing ‘NVIDIA Accelerated Graphics Driver for Linux-x86_64’ (331.67):
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-glcore.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-glcore.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libGL.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGL.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/xorg/modules/extensions/libglx.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/extensions/libglx.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/tls/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/tls/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/xorg/modules/drivers/nvidia_drv.so’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/drivers/nvidia_drv.so’…
executing: ‘/usr/bin/execstack -c /usr/lib64/xorg/modules/libnvidia-wfb.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/libnvidia-wfb.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-ml.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-ml.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-cfg.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-cfg.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-fbc.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-fbc.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libcuda.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libcuda.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-opencl.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-opencl.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libOpenCL.so.1.0.0’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libOpenCL.so.1.0.0’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-compiler.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-compiler.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libvdpau.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libvdpau.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/vdpau/libvdpau_trace.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/vdpau/libvdpau_trace.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/vdpau/libvdpau_nvidia.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/vdpau/libvdpau_nvidia.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvcuvid.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvcuvid.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-encode.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-encode.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib64/libnvidia-ifr.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-ifr.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libcuda.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libcuda.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-ml.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-ml.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libOpenCL.so.1.0.0’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libOpenCL.so.1.0.0’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-compiler.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-compiler.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-opencl.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-opencl.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libGL.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libGL.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-glcore.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-glcore.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/tls/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/tls/libnvidia-tls.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libvdpau.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libvdpau.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/vdpau/libvdpau_trace.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/vdpau/libvdpau_trace.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/vdpau/libvdpau_nvidia.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/vdpau/libvdpau_nvidia.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvcuvid.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvcuvid.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-encode.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-encode.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-eglcore.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-eglcore.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-glsi.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-glsi.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libEGL.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libEGL.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libGLESv2.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libGLESv2.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libGLESv1_CM.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libGLESv1_CM.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-ifr.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-ifr.so.331.67’…
executing: ‘/usr/bin/execstack -c /usr/lib/libnvidia-fbc.so.331.67’…
executing: ‘/usr/bin/chcon -t textrel_shlib_t /usr/lib/libnvidia-fbc.so.331.67’…
executing: ‘/sbin/ldconfig’…
executing: ‘/sbin/depmod -aq’…
→ done.
→ Driver file installation is complete.
→ Installing DKMS kernel module:
→ done.
→ Running post-install sanity check:
→ done.
→ Post-install sanity check passed.
→ Shared memory test passed.
→ Running runtime sanity check:
→ done.
→ Runtime sanity check passed.
→ Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up. (Answer: Yes)
→ Your X configuration file has been successfully updated. Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 331.67) is now complete.

However, I watched the output from a yum update of the kernel and got this:

Downloading Packages:
(1/4): kernel-2.6.32-431.21.1.el6.x86_64.rpm | 28 MB 00:00
(2/4): kernel-devel-2.6.32-431.21.1.el6.x86_64.rpm | 8.8 MB 00:00
(3/4): kernel-firmware-2.6.32-431.21.1.el6.noarch.rpm | 13 MB 00:00
(4/4): kernel-headers-2.6.32-431.21.1.el6.x86_64.rpm | 2.9 MB 00:00

Total 45 MB/s | 52 MB 00:01
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Updating : kernel-firmware-2.6.32-431.21.1.el6.noarch 1/8
Updating : kernel-2.6.32-431.21.1.el6.x86_64 2/8
Updating : kernel-headers-2.6.32-431.21.1.el6.x86_64 3/8
Updating : kernel-devel-2.6.32-431.21.1.el6.x86_64 4/8
Cleanup : kernel-2.6.32-431.17.1.el6.x86_64 5/8
dkms: removing: nvidia 331.67 (2.6.32-431.17.1.el6.x86_64) (x86_64)

-------- Uninstall Beginning --------
Module: nvidia
Version: 331.67
Kernel: 2.6.32-431.17.1.el6.x86_64 (x86_64)

Status: Before uninstall, this module version was ACTIVE on this kernel.
Removing any linked weak-modules

nvidia.ko:

  • Uninstallation
    • Deleting from: /lib/modules/2.6.32-431.17.1.el6.x86_64/extra/
  • Original module
    • No original module was found for this module on this kernel.
    • Use the dkms install command to reinstall any previous module version.

nvidia-uvm.ko:

  • Uninstallation
    • Deleting from: /lib/modules/2.6.32-431.17.1.el6.x86_64/extra/
  • Original module
    • No original module was found for this module on this kernel.
    • Use the dkms install command to reinstall any previous module version.

depmod…

DKMS: uninstall completed.


Deleting module version: 331.67
completely from the DKMS tree.

Done.
Cleanup : kernel-firmware-2.6.32-431.17.1.el6.noarch 6/8
Cleanup : kernel-headers-2.6.32-431.17.1.el6.x86_64 7/8
Cleanup : kernel-devel-2.6.32-431.17.1.el6.x86_64 8/8
Verifying : kernel-devel-2.6.32-431.21.1.el6.x86_64 1/8
Verifying : kernel-firmware-2.6.32-431.21.1.el6.noarch 2/8
Verifying : kernel-2.6.32-431.21.1.el6.x86_64 3/8
Verifying : kernel-headers-2.6.32-431.21.1.el6.x86_64 4/8
Verifying : kernel-headers-2.6.32-431.17.1.el6.x86_64 5/8
Verifying : kernel-2.6.32-431.17.1.el6.x86_64 6/8
Verifying : kernel-firmware-2.6.32-431.17.1.el6.noarch 7/8
Verifying : kernel-devel-2.6.32-431.17.1.el6.x86_64 8/8

Updated:
kernel.x86_64 0:2.6.32-431.21.1.el6 kernel-devel.x86_64 0:2.6.32-431.21.1.el6 kernel-firmware.noarch 0:2.6.32-431.21.1.el6
kernel-headers.x86_64 0:2.6.32-431.21.1.el6

Complete!

So it appears that DKMS kicked in during the cleanup phase and removed the kernel module for the current kernel level (2.6.32-431.17.1.el6), but it did not build one for the new level (2.6.32-431.21.1.el6). Had I then rebooted, I would have ended up with a hung system at the point where it tries to go into graphical mode.
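A quick way to confirm this before rebooting is to check which installed kernels actually have an nvidia.ko. The sketch below is only an illustration: the function name missing_nvidia_module is made up, and it assumes the module lives in the extra/ directory shown in the DKMS output above (it does not look at weak-updates or any other location).

```shell
#!/bin/sh
# Print every kernel version under a modules root that has no extra/nvidia.ko.
# missing_nvidia_module is a hypothetical helper name for this sketch.
# The optional argument overrides the root directory (default: /lib/modules).
missing_nvidia_module() {
    root="${1:-/lib/modules}"
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root is empty or missing.
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        if [ ! -e "$dir/extra/nvidia.ko" ]; then
            echo "$kver"
        fi
    done
}

# Usage: missing_nvidia_module            (checks /lib/modules)
#        missing_nvidia_module /some/root (checks another tree)
```

If the new kernel version shows up in the output, the module was not built for it and a reboot into that kernel would likely fail when X starts.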

Running the following commands gave me the required module, and the reboot went as expected:

dkms add nvidia/331.67
dkms build nvidia/331.67 -k 2.6.32-431.21.1.el6.x86_64
dkms install nvidia/331.67 -k 2.6.32-431.21.1.el6.x86_64
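For anyone who has to repeat this on several machines, the same three steps can be wrapped in a small function. This is a rough sketch, not a tested fix: rebuild_dkms_module and the DRY_RUN variable are names I made up, and it defaults to only printing the commands. It stops at the first failing step, so if the module is still registered with DKMS the add step may fail and would need to be skipped.

```shell
#!/bin/sh
# Run (or, by default, just print) the add/build/install sequence for one
# DKMS module and one kernel version.
# rebuild_dkms_module and DRY_RUN are hypothetical names for this sketch.
rebuild_dkms_module() {
    module="$1"   # e.g. nvidia/331.67
    kernel="$2"   # e.g. 2.6.32-431.21.1.el6.x86_64
    if [ -z "$module" ] || [ -z "$kernel" ]; then
        echo "usage: rebuild_dkms_module <module/version> <kernel-version>" >&2
        return 2
    fi
    for step in "add $module" "build $module -k $kernel" "install $module -k $kernel"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show what would be executed.
            echo "dkms $step"
        else
            # Word splitting of $step into separate arguments is intentional.
            dkms $step || return 1
        fi
    done
}

# Usage (prints the three commands):
#   rebuild_dkms_module nvidia/331.67 2.6.32-431.21.1.el6.x86_64
# To actually run them as root:
#   DRY_RUN=0 rebuild_dkms_module nvidia/331.67 2.6.32-431.21.1.el6.x86_64
```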

However, shouldn’t this be done automatically by DKMS during the kernel update phase?