We have been running our K80 for a few months now, and suddenly I can't update my kernel: the nvidia-uvm DKMS module fails to build against the new one. This is CentOS 7.2:
[root@bourbaki 352.99] # yum reinstall nvidia-uvm-kmod
Loaded plugins: fastestmirror, langpacks, refresh-packagekit
Loading mirror speeds from cached hostfile
 * base: centos.mirror.crucial.com.au
 * epel: epel.mirror.digitalpacific.com.au
 * extras: centos.mirror.crucial.com.au
 * updates: centos.mirror.crucial.com.au
Resolving Dependencies
--> Running transaction check
---> Package nvidia-uvm-kmod.x86_64 1:352.99-3.el7 will be reinstalled
--> Finished Dependency Resolution
Dependencies Resolved
=============================================================================================================================================================
 Package            Arch      Version           Repository    Size
Reinstalling:
 nvidia-uvm-kmod    x86_64    1:352.99-3.el7    cuda          123 k
Transaction Summary
Reinstall 1 Package
Total download size: 123 k
Installed size: 672 k
Is this ok [y/d/N]: y
Downloading packages:
nvidia-uvm-kmod-352.99-3.el7.x86_64.rpm | 123 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : 1:nvidia-uvm-kmod-352.99-3.el7.x86_64 1/1
Error! DKMS tree already contains: nvidia-uvm-352.99
You cannot add the same module/version combo more than once.
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...(bad exit status: 2)
make -j16 KERNELRELEASE=3.10.0-327.36.1.el7.x86_64 module KERNEL_UNAME=3.10.0-327.36.1.el7.x86_64 KBUILD_EXTMOD=/var/lib/dkms/nvidia-uvm/352.99/build RM_OUT_DIR=/var/lib/dkms/nvidia/352.99/source...(bad exit status: 2)
Error! Bad return status for module build on kernel: 3.10.0-327.36.1.el7.x86_64 (x86_64)
Consult /var/lib/dkms/nvidia-uvm/352.99/build/make.log for more information.
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...(bad exit status: 2)
make -j16 KERNELRELEASE=3.10.0-327.36.1.el7.x86_64 module KERNEL_UNAME=3.10.0-327.36.1.el7.x86_64 KBUILD_EXTMOD=/var/lib/dkms/nvidia-uvm/352.99/build RM_OUT_DIR=/var/lib/dkms/nvidia/352.99/source...(bad exit status: 2)
Error! Bad return status for module build on kernel: 3.10.0-327.36.1.el7.x86_64 (x86_64)
Consult /var/lib/dkms/nvidia-uvm/352.99/build/make.log for more information.
warning: %post(nvidia-uvm-kmod-1:352.99-3.el7.x86_64) scriptlet failed, exit status 10
Non-fatal POSTIN scriptlet failure in rpm package 1:nvidia-uvm-kmod-352.99-3.el7.x86_64
Verifying : 1:nvidia-uvm-kmod-352.99-3.el7.x86_64 1/1
Installed:
nvidia-uvm-kmod.x86_64 1:352.99-3.el7
Complete!
[root@bourbaki 352.99] # cat /var/lib/dkms/nvidia-uvm/352.99/build/make.log
DKMS make.log for nvidia-uvm-352.99 for kernel 3.10.0-327.36.1.el7.x86_64 (x86_64)
Wed Sep 21 08:40:54 AEST 2016
Makefile:179: /var/lib/dkms/nvidia/352.99/source/nvidia-modules-common.mk: No such file or directory
make: *** No rule to make target `/var/lib/dkms/nvidia/352.99/source/nvidia-modules-common.mk'. Stop.
Further version information:
[root@bourbaki 352.99] # rpm -qa | grep cuda | sort
cuda-7.5-18.x86_64
cuda-7-5-7.5-18.x86_64
cuda-command-line-tools-7-5-7.5-18.x86_64
cuda-core-7-5-7.5-18.x86_64
cuda-cublas-7-5-7.5-18.x86_64
cuda-cublas-dev-7-5-7.5-18.x86_64
cuda-cudart-7-5-7.5-18.x86_64
cuda-cudart-dev-7-5-7.5-18.x86_64
cuda-cufft-7-5-7.5-18.x86_64
cuda-cufft-dev-7-5-7.5-18.x86_64
cuda-curand-7-5-7.5-18.x86_64
cuda-curand-dev-7-5-7.5-18.x86_64
cuda-cusolver-7-5-7.5-18.x86_64
cuda-cusolver-dev-7-5-7.5-18.x86_64
cuda-cusparse-7-5-7.5-18.x86_64
cuda-cusparse-dev-7-5-7.5-18.x86_64
cuda-documentation-7-5-7.5-18.x86_64
cuda-driver-dev-7-5-7.5-18.x86_64
cuda-drivers-352.99-0.x86_64
cuda-license-7-5-7.5-18.x86_64
cuda-misc-headers-7-5-7.5-18.x86_64
cuda-npp-7-5-7.5-18.x86_64
cuda-npp-dev-7-5-7.5-18.x86_64
cuda-nvidia-kmod-common-352.99-0.x86_64
cuda-nvrtc-7-5-7.5-18.x86_64
cuda-nvrtc-dev-7-5-7.5-18.x86_64
cuda-repo-rhel7-7.5-18.x86_64
cuda-runtime-7-5-7.5-18.x86_64
cuda-samples-7-5-7.5-18.x86_64
cuda-toolkit-7-5-7.5-18.x86_64
cuda-visual-tools-7-5-7.5-18.x86_64
[root@bourbaki 352.99] # rpm -qa | grep kernel | sort
abrt-addon-kerneloops-2.1.11-36.el7.centos.x86_64
kernel-3.10.0-327.28.3.el7.x86_64
kernel-3.10.0-327.36.1.el7.x86_64
kernel-devel-3.10.0-327.28.3.el7.x86_64
kernel-devel-3.10.0-327.36.1.el7.x86_64
kernel-headers-3.10.0-327.36.1.el7.x86_64
kernel-tools-3.10.0-327.36.1.el7.x86_64
kernel-tools-libs-3.10.0-327.36.1.el7.x86_64
libreport-plugin-kerneloops-2.1.11-32.el7.centos.x86_64
texlive-l3kernel-svn29409.SVN_4469-38.el7.noarch
[root@bourbaki 352.99] # uname -a
Linux bourbaki.une.edu.au 3.10.0-327.36.1.el7.x86_64 #1 SMP Sun Sep 18 13:04:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@bourbaki 352.99] # rpm -qa | grep nvidia | sort
cuda-nvidia-kmod-common-352.99-0.x86_64
nvidia-kmod-352.99-2.el7.x86_64
nvidia-uvm-kmod-352.99-3.el7.x86_64
xorg-x11-drv-nvidia-352.99-1.el7.x86_64
xorg-x11-drv-nvidia-devel-352.99-1.el7.x86_64
xorg-x11-drv-nvidia-gl-352.99-1.el7.x86_64
xorg-x11-drv-nvidia-libs-352.99-1.el7.x86_64
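
The make.log above fails on a single include that make cannot find under /var/lib/dkms/nvidia/352.99/source. As a rough diagnostic sketch (assuming a POSIX shell; `missing_includes` and `check_missing` are hypothetical helper names, not part of DKMS), the following pulls every path that make reported as missing out of a make.log and says whether the file or its directory exists now. It only reports; it does not change anything in the DKMS tree.

```shell
#!/bin/sh
# Hypothetical helpers for reading a DKMS make.log; not part of DKMS itself.

# Print every path that make reported as "No such file or directory".
# Expects lines of the form "Makefile:179: /some/path: No such file or directory".
missing_includes() {
    sed -n 's/^[^:]*:[0-9]*: \(.*\): No such file or directory$/\1/p' "$1"
}

# For each missing path, report whether the file or its parent directory exists.
# Note: a dangling symlink counts as missing here, because [ -e ] follows links.
check_missing() {
    missing_includes "$1" | while IFS= read -r path; do
        dir=$(dirname "$path")
        if [ -e "$path" ]; then
            echo "present: $path"
        elif [ -e "$dir" ]; then
            echo "file missing, directory present: $path"
        else
            echo "directory missing: $dir"
        fi
    done
}

# Example (run on the affected machine):
#   check_missing /var/lib/dkms/nvidia-uvm/352.99/build/make.log
#   dkms status
```

Running `dkms status` alongside it shows which module/version pairs DKMS believes are registered, which may help explain the "DKMS tree already contains: nvidia-uvm-352.99" message.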