[INFO]: Finished with code: 256 , [ERROR]: Install of driver component failed

Hi,

I am following the official installation guide at https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#redhat-installation, but unfortunately the installation failed with the messages shown in the log below.

Could anyone help figure out what’s wrong and how to fix it?

Cmd:

sudo sh cuda_10.1.243_418.87.00_linux.run

Log:
/var/log/cuda-installer.log

[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /bin/gcc

[INFO]: gcc version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)

[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Driver
[INFO]: 418.87.00
[INFO]: Executing NVIDIA-Linux-x86_64-418.87.00.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd 2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 418.87.00 failed, quitting

~ > lspci | grep nvidia -i
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)

~ > uname -msr
Linux 3.10.0-957.el7.x86_64 x86_64

Thanks very much!

Kevin

The CUDA installer attempted to install the 418.87.00 driver and the driver installation failed. To find out why the driver installation failed, you’ll need to check the driver installer log.

That log would typically be at:

/var/log/nvidia-installer.log


The person above has literally posted the contents of the driver installer log file.

Similar problem when installing CUDA 10.2 with the runfile:

$ cat /var/log/cuda-installer.log
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc

[INFO]: gcc version: gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1) 

[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install: 
[INFO]: Driver
[INFO]: 440.33.01
[INFO]: Executing NVIDIA-Linux-x86_64-440.33.01.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd  2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 440.33.01 failed, quitting

No, that is the contents of the cuda installer log, not the driver installer log. The CUDA runfile installer calls the driver installer. When the driver installer runs, it creates a log file. I have already indicated a typical place where that log file may be found.

Here is an example of the /var/log directory contents on a Redhat/CentOS system after running the cuda_10.1.243_418.87.00_linux.run runfile installer:

$ ls /var/log
anaconda            cups              maillog-20191222      nvidia-uninstall.log  secure-20191230     wtmp
audit               dmesg             maillog-20191230      nvstats               secure-20200105     Xorg.0.log
boot.log            dmesg.old         maillog-20200105      openlmi-install.log   speech-dispatcher   Xorg.0.log.old
btmp                firewalld         messages              pcp                   spooler             Xorg.9.log
btmp-20200101       gdm               messages-20191216     pluto                 spooler-20191216    yum.log
chrony              glusterfs         messages-20191222     ppp                   spooler-20191222    yum.log-20170315
cron                grubby            messages-20191230     qemu-ga               spooler-20191230    yum.log-20180101
cron-20191216       httpd             messages-20200105     sa                    spooler-20200105    yum.log-20191021
cron-20191222       lastlog           ntpstats              samba                 sssd                yum.log-20200101
cron-20191230       libvirt           nvidia                secure                tallylog
cron-20200105       maillog           nvidia-installer.log  secure-20191216       tuned
cuda-installer.log  maillog-20191216  nvidia-mps            secure-20191222       wpa_supplicant.log
$

Note that there is a file called:

cuda-installer.log

and also a file called:

nvidia-installer.log

The cuda-installer.log file contains the log of the CUDA installer. The CUDA installer calls the driver installer, and the driver installer logs detailed information in the nvidia-installer.log file. Not all of this information is recorded in cuda-installer.log, so when the driver installer fails, it is necessary to inspect nvidia-installer.log to get the most detailed information about why the driver install failed.
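
As a quick way to pull the relevant lines out of the driver installer log, here is a minimal sketch. The function name `show_installer_problems` is made up for illustration, and the default path is only the typical location mentioned above; pass a different path if your system logs elsewhere.

```shell
#!/bin/sh
# Print only the ERROR and WARNING lines from a driver installer log.
# Default path: the typical location of the driver installer log.
# show_installer_problems is a hypothetical helper, not an NVIDIA tool.
show_installer_problems() {
    log="${1:-/var/log/nvidia-installer.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log (it may not exist, or may need sudo)" >&2
        return 1
    fi
    # The driver installer prefixes its diagnostics with ERROR: or WARNING:
    grep -E '^(ERROR|WARNING):' "$log"
}
```

Calling `show_installer_problems` with no argument reads /var/log/nvidia-installer.log. The full log is still worth reading, since the lines starting with `->` often carry the context for an error.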


Ah, I see. Thank you for the clarification about

    /var/log/cuda-installer.log /var/log/nvidia-installer.log

files.

Based on the information from the latter, in my particular case the problem was due to installation while running the X server:

...
-> The file '/tmp/.X0-lock' exists and appears to contain the process ID '1596' of a runnning X server.
ERROR: You appear to be running an X server; please exit X before installing.  For further details, please see the section INSTALLING THE NVIDIA DRIVER in the README available on the Linux driver download page at www.nvidia.com.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

I was able to successfully install CUDA 10.2 by addressing the issue above.
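
For anyone hitting the same message, the sketch below checks whether the lock file named in the log still points at a live process. It assumes a Linux system with /proc mounted and assumes the lock file contains the server's PID padded with spaces; `check_x_lock` is a made-up helper name.

```shell
#!/bin/sh
# Report whether an X lock file points at a live process.
# Exit status: 0 = X appears to be running,
#              1 = no lock file,
#              2 = lock file exists but its PID is not alive (likely stale).
# check_x_lock is a hypothetical helper; it only inspects, it changes nothing.
check_x_lock() {
    lock="${1:-/tmp/.X0-lock}"
    if [ ! -e "$lock" ]; then
        echo "no lock file at $lock"
        return 1
    fi
    # Strip the padding spaces and the trailing newline around the PID.
    pid=$(tr -d ' \n' < "$lock")
    case "$pid" in
        ''|*[!0-9]*) pid_ok=no ;;
        *) pid_ok=yes ;;
    esac
    # On Linux, a live process has a directory under /proc.
    if [ "$pid_ok" = yes ] && [ -d "/proc/$pid" ]; then
        echo "X server appears to be running with PID $pid"
        return 0
    fi
    echo "lock file $lock looks stale (PID '$pid' is not running)"
    return 2
}
```

If the function reports a running X server, the usual approach is to leave the graphical session before installing, for example with `sudo systemctl isolate multi-user.target` on systemd-based distributions. Which command applies depends on your distribution and display manager.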

Suggestion: It would be nice in the future if the descriptive information about errors from nvidia-installer.log would appear at the console output in case of installation failure, instead of error codes as it is now.


Hello,
I followed the instructions.
Here is my log from /var/log/nvidia-installer.log:
nvidia-installer log file ‘/var/log/nvidia-installer.log’
creation time: Mon Jan 20 22:27:52 2020
installer version: 440.33.01

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
./nvidia-installer
--ui=none
--no-questions
--accept-license
--disable-nouveau
--no-cc-version-check
--install-libglvnd

Using built-in stream user interface
-> Detected 8 CPUs online; setting concurrency level to 8.
-> Installing NVIDIA driver version 440.33.01.
-> Running distribution scripts
executing: ‘/usr/lib/nvidia/pre-install’…
-> done.
-> The distribution-provided pre-install script failed! Are you sure you want to continue? (Answer: Continue installation)
ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution’s documentation for details on how to correctly disable the Nouveau kernel driver.
WARNING: One or more modprobe configuration files to disable Nouveau are already present at: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf. Please be sure you have rebooted your system since these files were written. If you have rebooted, then Nouveau may be enabled for other reasons, such as being included in the system initial ramdisk or in your X configuration file. Please consult the NVIDIA driver README and your Linux distribution’s documentation for details on how to correctly disable the Nouveau kernel driver.
-> For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory. Would you like nvidia-installer to attempt to create this modprobe file for you? (Answer: Yes)
-> One or more modprobe configuration files to disable Nouveau have been written. For some distributions, this may be sufficient to disable Nouveau; other distributions may require modification of the initial ramdisk. Please reboot your system and attempt NVIDIA driver installation again. Note if you later wish to reenable Nouveau, you will need to delete these files: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
ERROR: Installation has failed. Please see the file ‘/var/log/nvidia-installer.log’ for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.


I have an Acer Aspire V15 Nitro Black Edition (NVIDIA 960M and integrated Intel HD 530 Graphics).
Could that be the problem? I’m using Ubuntu 18.04.3 LTS 64-bit, and the processor is an Intel® Core™ i7-6700HQ CPU @ 2.60GHz.

What am I supposed to do?

Michal

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau
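
A minimal sketch of the steps in that section of the guide, as I understand them: write a two-line blacklist file, regenerate the initial ramdisk, reboot, and confirm Nouveau is no longer loaded. The function names are made up for illustration, and the optional path arguments exist only so the helpers can be tried on a scratch file first.

```shell
#!/bin/sh
# Write the Nouveau blacklist file (needs root for the default path).
# write_nouveau_blacklist and nouveau_loaded are hypothetical helper names.
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Succeed if a module named nouveau is listed in the module table
# (default: /proc/modules, the same data that lsmod prints).
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}"
}
```

After writing the file, regenerate the initial ramdisk (`sudo update-initramfs -u` on Ubuntu, `sudo dracut --force` on RHEL/CentOS) and reboot. If `nouveau_loaded` still succeeds after the reboot, the log's own hint applies: Nouveau may be enabled through the initial ramdisk or the X configuration.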


Thank you,
I wanted to be sure before doing anything like that.
After the installation, should I remove the created file /etc/modprobe.d/blacklist-nouveau.conf?

Michal

No, don’t remove it; there is no need to.

Hi, I’m getting the error below in nvidia-installer.log. Please suggest what I need to do. I have Ubuntu 19.10 installed.

Using built-in stream user interface
-> Detected 12 CPUs online; setting concurrency level to 12.
WARNING: The NVIDIA Quadro 4000 GPU installed in this system is supported through the NVIDIA 390.xx legacy Linux graphics drivers. Please visit http://www.nvidia.com/object/unix.html for more information. The 440.33.01 NVIDIA Linux graphics driver will ignore this GPU.
WARNING: You do not appear to have an NVIDIA GPU supported by the 440.33.01 NVIDIA Linux graphics driver installed in this system. For further details, please see the appendix SUPPORTED NVIDIA GRAPHICS CHIPS in the README available on the Linux driver download page at www.nvidia.com.
ERROR: An NVIDIA kernel module ‘nvidia-drm’ appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module’s usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed. Please see the file ‘/var/log/nvidia-installer.log’ for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

I am also having an issue installing and would appreciate help fixing it. I have had this issue on multiple machines, and I have tried from a completely fresh install a few times, as this toolkit doesn’t seem to work.

Here are my log files starting with the nvidia-installer.log and followed by the cuda-installer.log

nvidia-installer log file ‘/var/log/nvidia-installer.log’
creation time: Fri Apr 24 16:56:44 2020
installer version: 440.33.01

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
./nvidia-installer
--ui=none
--no-questions
--accept-license
--disable-nouveau
--no-cc-version-check
--install-libglvnd

Using built-in stream user interface
-> Detected 12 CPUs online; setting concurrency level to 12.
ERROR: An NVIDIA kernel module ‘nvidia-drm’ appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module’s usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed. Please see the file ‘/var/log/nvidia-installer.log’ for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

[INFO]: Driver installation detected by command: apt list --installed | grep -e nvidia-driver-[0-9][0-9][0-9] -e nvidia-[0-9][0-9][0-9]
[INFO]: Cleaning up window
[INFO]: Complete
[INFO]: Checking compiler version…
[INFO]: gcc location: /usr/bin/gcc

[INFO]: gcc version: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Driver
[INFO]: 440.33.01
[INFO]: Executing NVIDIA-Linux-x86_64-440.33.01.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd 2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 440.33.01 failed, quitting
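
The 'nvidia-drm' error above means an NVIDIA kernel module is already loaded, and the "Driver installation detected by command: apt list --installed" line suggests a distribution-packaged driver is already present. A small sketch for seeing which NVIDIA modules are loaded and how many users each has, assuming the usual /proc/modules layout (name, size, reference count, ...); `loaded_nvidia_modules` is a made-up name:

```shell
#!/bin/sh
# List loaded kernel modules whose name starts with "nvidia",
# together with their reference counts (third column of /proc/modules).
# loaded_nvidia_modules is a hypothetical helper name.
loaded_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { printf "%s refcount=%s\n", $1, $3 }' "${1:-/proc/modules}"
}
```

Note that module names appear with underscores here (nvidia_drm rather than nvidia-drm). If a packaged driver is already installed and working, one option is to deselect the Driver component in the runfile's menu and install only the toolkit. Otherwise the programs using the GPU (typically the X server) have to exit before the modules can be replaced.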


Hello folks,

I am failing at this stage as well. Please check my nvidia-installer error log below:

ERROR: An error occurred while performing the step: “Checking to see whether the nvidia kernel module was successfully built”. See /var/log/nvidia-installer.log for details.
-> The command cd ./kernel; /usr/bin/make -k -j12 NV_KERNEL_MODULES="nvidia" NV_EXCLUDE_KERNEL_MODULES="" SYSSRC="/lib/modules/5.3.0-1028-azure/build" SYSOUT="/lib/modules/5.3.0-1028-azure/build" failed with the following output:

make[1]: Entering directory ‘/usr/src/linux-headers-5.3.0-1028-azure’
CC [M] /tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv-pci-table.o
CC [M] /tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.o
In file included from /tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.c:21:0:
/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.c: In function ‘nvUvmInterfaceDeRegisterUvmOps’:
/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/common/inc/nv-linux.h:733:21: error: void value not ignored as it ought to be
int __ret = on_each_cpu(func, info, 1);
^
/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.c:991:5: note: in expansion of macro ‘NV_ON_EACH_CPU’
NV_ON_EACH_CPU(flush_top_half, NULL);
^~~~~~~~~~~~~~
scripts/Makefile.build:288: recipe for target ‘/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.o’ failed
make[2]: *** [/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel/nvidia/nv_uvm_interface.o] Error 1
make[2]: Target ‘__build’ not remade because of errors.
Makefile:1656: recipe for target ‘module/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel’ failed
make[1]: *** [module/tmp/selfgz15855/NVIDIA-Linux-x86_64-418.87.00/kernel] Error 2
make[1]: Target ‘modules’ not remade because of errors.
make[1]: Leaving directory ‘/usr/src/linux-headers-5.3.0-1028-azure’
Makefile:81: recipe for target ‘modules’ failed
make: *** [modules] Error 2
ERROR: The nvidia kernel module was not created.

Could anybody please suggest a solution for this?
I am trying to install NVIDIA driver 418.87 with CUDA 10.1 but to no avail. Kindly assist. Thanks in advance.
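
A possible reading of the compile error, offered as an assumption rather than a confirmed diagnosis: "void value not ignored as it ought to be" on `on_each_cpu()` suggests the 418.87.00 sources expect that kernel function to return an int, which, as far as I know, stopped being the case in Linux 5.3. The log shows the build running against 5.3.0-1028-azure. The sketch below only checks whether a kernel release string is at or above a given major.minor version; `kernel_at_least` is a made-up helper.

```shell
#!/bin/sh
# Succeed if a kernel release (default: the running kernel) is at least
# MAJOR.MINOR. Usage: kernel_at_least MAJOR MINOR [RELEASE]
# kernel_at_least is a hypothetical helper name.
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release="${3:-$(uname -r)}"
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of what follows
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

If `kernel_at_least 5 3` succeeds on the machine, a driver release newer than 418.87.00 (and the CUDA version that pairs with it), or an older kernel, may be needed. Check the driver's README for the kernels it supports before relying on this guess.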

Not sure if this thread is still active, but I am seeing a similar issue, and there are no entries in nvidia-installer.log.
This is the output of /var/log/cuda-installer.log while installing the CUDA 11.1 toolkit:
[INFO]: Driver not installed.
[INFO]: Checking compiler version…
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 8.3.0 (Debian 8.3.0-6)
[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Driver
[INFO]: 455.23.05
[INFO]: Executing NVIDIA-Linux-x86_64-455.23.05.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd 2>&1
[INFO]: Finished with code: 3840
[ERROR]: Install of driver component failed.
[ERROR]: Install of 455.23.05 failed, quitting

I followed the pre-installation instructions as per :

The cuda toolkit I am trying to install is from:


The file is: cuda_11.1.0_455.23.05_linux.run

I did try to clean up the previous NVIDIA driver 440 installation by running the uninstaller. The output of the uninstall log is:

nvidia-installer log file ‘/var/log/nvidia-uninstall.log’
creation time: Mon Oct 12 02:21:53 2020
installer version: 440.95.01
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
nvidia-installer command line:
/usr/bin/nvidia-uninstall
Using: nvidia-installer ncurses v6 user interface
-> Detected 4 CPUs online; setting concurrency level to 4.
-> If you plan to no longer use the NVIDIA driver, you should make sure that no X screens are configured to use the NVIDIA X driver in your X configuration file. If you used nvidia-xconfig to configure X, it may have created a backup of your original configuration. Would you like to run nvidia-xconfig --restore-original-backup to attempt restoration of the original X configuration file? (Answer: No)
-> Parsing log file:
-> done.
-> Validating previous installation:
-> The installed file ‘/usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0’ seems to have changed, but prelink -u failed; unable to restore ‘/usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0’ to an un-prelinked state.
-> The previously installed symlink ‘/usr/lib/x86_64-linux-gnu/libOpenCL.so.1’ has target ‘libOpenCL.so.1.0.0’, but it was installed with target ‘libOpenCL.so.1.0’. /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 will not be uninstalled.
-> done.
WARNING: Your driver installation has been altered since it was initially installed; this may happen, for example, if you have since installed the NVIDIA driver through a mechanism other than nvidia-installer (such as your distribution’s native package management system). nvidia-installer will attempt to uninstall as best it can. Please see the file ‘/var/log/nvidia-uninstall.log’ for details.
-> Uninstalling NVIDIA Accelerated Graphics Driver for Linux-x86_64 (1.0-4409501 (440.95.01)):
-> Failed to delete the directory ‘/usr/lib/x86_64-linux-gnu/vdpau’ (Directory not empty).
WARNING: Failed to delete some directories. See /var/log/nvidia-uninstall.log for details.
-> Unable to delete directories created by previous installation.
-> done.
-> Running depmod and ldconfig:
-> done.
-> Uninstallation of existing driver: NVIDIA Accelerated Graphics Driver for Linux-x86_64 (440.95.01) is complete.

I am not sure what is going on.
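
One observation on the "Finished with code" values in this thread (256 in the earlier logs, 3840 here): both are multiples of 256, which is what a raw wait status looks like when the exit code sits in the upper byte. This is only a guess about how the CUDA installer reports the driver installer's result, not documented behavior. Under that assumption, the exit code can be recovered as follows (`decode_status` is a made-up name):

```shell
#!/bin/sh
# Recover an exit code from a raw wait status, ASSUMING the installer
# prints the status in the classic layout (exit code in bits 8-15).
# decode_status is a hypothetical helper name.
decode_status() {
    echo $(( ($1 >> 8) & 255 ))
}
```

With this reading, 256 would correspond to exit code 1 and 3840 to exit code 15. What a given exit code means to the driver installer is not something this thread establishes, so the driver installer log remains the place to look when it has content.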

I experienced this exact issue on a squeaky-clean install of Ubuntu 16.04. As with the others, I didn’t know about the Nvidia installer log. It said that /tmp/.X0-lock file existed so it thought X was running. A simple:

sudo rm /tmp/.X0-lock

…solved the problem and the installer continued without a hitch.

Thank you. This solved my problem.