Hello,
I’m trying to install the NVIDIA vGPU (GRID) guest driver on RHEL 7, but the driver apparently does not recognize the vGPU. I get the same error with both of the following drivers:
- NVIDIA-Linux-x86_64-384.111-grid.run
- NVIDIA-Linux-x86_64-390.42-grid.run
Nouveau is blacklisted and disabled:
[root@testgpu ~]# lsmod | grep -i nouveau
[root@testgpu ~]#
The vGPU is seen by the kernel:
[root@testgpu ~]# lspci | grep -i nvidia
02:00.0 VGA compatible controller: NVIDIA Corporation GP100GL (rev a1)
[root@testgpu ~]#
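Since the driver matches devices by PCI ID rather than by name, the numeric vendor:device pair is probably the more useful thing to compare against the supported-products list. A minimal sketch for pulling it out of `lspci -nn` output follows; the function name `nvidia_pci_ids` is made up for illustration, and it assumes GNU grep and the usual `[vendor:device]` bracket format that `lspci -nn` prints:

```shell
# Print the vendor:device ID of every NVIDIA device found in `lspci -nn` output.
# Reads the lspci text on stdin, so it also works on saved output.
# (Hypothetical helper name; 10de is NVIDIA's PCI vendor ID.)
nvidia_pci_ids() {
    grep -i 'nvidia' | grep -oE '\[10de:[0-9a-fA-F]{4}\]' | tr -d '[]'
}

# Usage on the live system:
#   lspci -nn | nvidia_pci_ids
```

The result can then be compared with the PCI ID that the NVRM message reports in the kernel log.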
Kernel used is:
[root@testgpu ~]# uname -r
3.10.0-514.26.2.el7.x86_64
[root@testgpu ~]#
Kernel headers and dev packages are installed:
[root@testgpu ~]# rpm -qa | grep "^kernel.*$(uname -r)"
kernel-devel-3.10.0-514.26.2.el7.x86_64
kernel-tools-libs-3.10.0-514.26.2.el7.x86_64
kernel-tools-3.10.0-514.26.2.el7.x86_64
kernel-3.10.0-514.26.2.el7.x86_64
kernel-headers-3.10.0-514.26.2.el7.x86_64
[root@testgpu ~]#
dkms is also installed.
I ran the installer without any options, answered yes to registering with DKMS and no to the 32-bit compatibility libraries. Here is the content of /var/log/nvidia-installer.log:
[root@testgpu ~]# cat /var/log/nvidia-installer.log
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Wed Apr 18 10:19:47 2018
installer version: 384.111
PATH: /usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/puppetlabs/bin:/root/bin
nvidia-installer command line:
./nvidia-installer
Unable to load: nvidia-installer ncurses v6 user interface
Using: nvidia-installer ncurses user interface
-> Detected 1 CPUs online; setting concurrency level to 1.
-> Tagging shared libraries with chcon -t textrel_shlib_t.
-> License accepted.
-> Installing NVIDIA driver version 384.111.
-> Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. (Answer: Yes)
-> Installing both new and classic TLS OpenGL libraries.
-> Installing both new and classic TLS 32bit OpenGL libraries.
-> Install NVIDIA's 32-bit compatibility libraries? (Answer: No)
-> Will install GLVND GLX client libraries.
-> Will install GLVND EGL client libraries.
-> Skipping GLX non-GLVND file: "libGL.so.384.111"
-> Skipping GLX non-GLVND file: "libGL.so.1"
-> Skipping GLX non-GLVND file: "libGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.384.111"
-> Skipping EGL non-GLVND file: "libEGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.1"
Looking for install checker script at ./libglvnd_install_checker/check-libglvnd-install.sh
executing: '/bin/sh ./libglvnd_install_checker/check-libglvnd-install.sh'...
Checking for libglvnd installation.
Checking libGLdispatch...
Can't load library libGLdispatch.so.0: libGLdispatch.so.0: cannot open shared object file: No such file or directory
Will install libglvnd libraries.
Will install libEGL vendor library config file to /usr/share/glvnd/egl_vendor.d
-> Searching for conflicting files:
-> done.
-> Installing 'NVIDIA Accelerated Graphics Driver for Linux-x86_64' (384.111):
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-glcore.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/extensions/libglx.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-tls.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/tls/libnvidia-tls.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLX_nvidia.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libOpenGL.so.0'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLESv1_CM.so.1'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLESv2.so.2'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLdispatch.so.0'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLX.so.0'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGL.so.1.0.0'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libEGL.so.1'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/drivers/nvidia_drv.so'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/xorg/modules/libnvidia-wfb.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-gtk2.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-gtk3.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-cfg.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-ml.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/nvidia/gridd/libFlxCore64.so.2015.03'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/nvidia/gridd/libFlxComm64.so.2015.03'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/vdpau/libvdpau_nvidia.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libcuda.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-opencl.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libOpenCL.so.1.0.0'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-fatbinaryloader.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-ptxjitcompiler.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvcuvid.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-encode.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-ifr.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-fbc.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-compiler.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-eglcore.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-glsi.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libEGL_nvidia.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLESv2_nvidia.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libGLESv1_CM_nvidia.so.384.111'...
executing: '/usr/bin/chcon -t textrel_shlib_t /usr/lib64/libnvidia-egl-wayland.so.1.0.1'...
executing: '/usr/sbin/ldconfig'...
-> done.
-> Driver file installation is complete.
-> Installing DKMS kernel module:
-> done.
ERROR: Unable to load the 'nvidia-drm' kernel module.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
[root@testgpu ~]#
The kernel modules themselves were built and installed:
[root@testgpu ~]# find /usr/lib/modules -name "*.ko" | grep -i nvidia
/usr/lib/modules/3.10.0-514.el7.x86_64/weak-updates/nvidia-uvm.ko
/usr/lib/modules/3.10.0-514.el7.x86_64/weak-updates/nvidia-modeset.ko
/usr/lib/modules/3.10.0-514.el7.x86_64/weak-updates/nvidia-drm.ko
/usr/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/nvidia.ko
/usr/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/nvidia-uvm.ko
/usr/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/nvidia-modeset.ko
/usr/lib/modules/3.10.0-514.26.2.el7.x86_64/extra/nvidia-drm.ko
[root@testgpu ~]#
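To make it easier to see which kernel trees actually contain a given module, the listing above can be summarised per kernel version. This is only a sketch: `kernels_with_module` is a hypothetical helper, and it assumes every path on stdin starts with /usr/lib/modules/<kernel-version>/ as in the output above:

```shell
# From a list of .ko paths on stdin, print the kernel versions whose tree
# contains the named module (pass the name without the .ko suffix).
# Assumes paths of the form /usr/lib/modules/<kernel-version>/.../<name>.ko,
# so the kernel version is the 5th '/'-separated field.
kernels_with_module() {
    awk -F/ -v mod="$1.ko" '$NF == mod { print $5 }' | sort -u
}

# Usage on the live system:
#   find /usr/lib/modules -name "*.ko" | kernels_with_module nvidia
```

With the listing above, `nvidia` only shows up under the running 3.10.0-514.26.2.el7.x86_64 kernel, while `nvidia-drm` is also linked into the 3.10.0-514.el7.x86_64 tree via weak-updates.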
Kernel messages are:
[root@testgpu ~]# dmesg | grep -i nvidia
[ 438.635687] nvidia: loading out-of-tree module taints kernel.
[ 438.635692] nvidia: module license 'NVIDIA' taints kernel.
[ 438.640539] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 438.644476] nvidia-nvlink: Nvlink Core is being initialized, major device number 246
[ 438.644756] nvidia 0000:02:00.0: enabling device (0100 -> 0103)
[ 438.646329] NVRM: The NVIDIA GPU 0000:02:00.0 (PCI ID: 10de:15f8)
NVRM: NVIDIA 384.111 driver release.
NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
NVRM: specific graphics driver download page at www.nvidia.com.
[ 438.646530] nvidia: probe of 0000:02:00.0 failed with error -1
[ 438.646542] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 438.646543] NVRM: None of the NVIDIA graphics adapters were initialized!
[ 438.646596] nvidia-nvlink: Unregistered the Nvlink Core, major device number 246
[root@testgpu ~]#
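For anyone trying to reproduce this on a machine with several devices, the failing PCI address can be pulled out of the kernel log with something like the following. It is a rough sketch that assumes the probe-failure line has exactly the wording shown above; `failed_nvidia_probes` is a hypothetical name:

```shell
# Print the PCI address of every device for which the nvidia probe failed,
# reading kernel log text on stdin.
# Matches lines like: "nvidia: probe of 0000:02:00.0 failed with error -1"
failed_nvidia_probes() {
    sed -n 's/.*nvidia: probe of \([0-9a-fA-F:.]*\) failed with error.*/\1/p'
}

# Usage on the live system:
#   dmesg | failed_nvidia_probes
```

Each address it prints can be passed to `lspci -nn -s <address>` to see the device's numeric IDs.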
When trying with modprobe:
[root@testgpu ~]# modprobe -v nvidia
insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/extra/nvidia.ko
modprobe: ERROR: could not insert 'nvidia': No such device
[root@testgpu ~]#
The physical GPU is a Tesla P100, so I also tried the regular (non-GRID) driver for that card, without success.
Can anyone tell me whether I’m doing something wrong and, if so, what?
Thanks.