I carefully went through all the prerequisites on this Ubuntu 18.04 install with a GTX 1050 and a 7700k CPU. I had no issues following the pre-installation steps to the letter.
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/#pre-installation-actions
Then, following the instructions in section 4.1.5.2 for the Runfile Installer, I ran into trouble.
https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#ubuntu-x86_64-run
The runfile terminates with an error, so I inspected the attached /var/log/cuda-installer.log for issues. There seem to be several:
- The first is the warning below about the Nouveau driver. I don’t understand why it appears, since “lsmod” shows that Nouveau is not loaded.
[INFO]: WARNING: One or more modprobe configuration files to disable Nouveau are already present at: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf. Please be sure you have rebooted your system since these files were written. If you have rebooted, then Nouveau may be enabled for other reasons, such as being included in the system initial ramdisk or in your X configuration file. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
- Then this error.
[INFO]: ERROR: Unable to load the kernel module 'nvidia-modeset.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA GPU(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release.
- Which is enough to cause…
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 418.67 failed, quitting
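As a cross-check of the “lsmod” result, a minimal Python sketch along these lines could confirm whether Nouveau is loaded and which files under /etc/modprobe.d blacklist it. The function names are hypothetical, and the sketch assumes the usual /proc/modules layout (module name in the first column). It does not inspect the initial ramdisk or the X configuration, which the warning also mentions.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the module name
    return names


def blacklists_nouveau(conf_text):
    """True if modprobe.d-style text has an uncommented 'blacklist nouveau' line."""
    for line in conf_text.splitlines():
        tokens = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if tokens == ["blacklist", "nouveau"]:
            return True
    return False


if __name__ == "__main__":
    proc = Path("/proc/modules")
    if proc.exists():
        mods = loaded_modules(proc.read_text())
        print("nouveau loaded:", "nouveau" in mods)
        # Module names use underscores here, e.g. nvidia_modeset.
        print("nvidia modules:", sorted(m for m in mods if m.startswith("nvidia")))

    conf_dir = Path("/etc/modprobe.d")
    if conf_dir.is_dir():
        for conf in sorted(conf_dir.glob("*.conf")):
            try:
                if blacklists_nouveau(conf.read_text()):
                    print("blacklist found in", conf)
            except (OSError, UnicodeDecodeError):
                pass  # unreadable file; skip it
```

If this reports that Nouveau is not loaded while a blacklist file exists, that would match what “lsmod” showed.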
I’m not sure if this is informative, but here is a snippet of dmesg from one of the level-3 boots while attempting to use the Runfile Installer.
[ 54.760499] nvidia-uvm: Unloaded the UVM driver in 8 mode
[ 54.784146] [drm] [nvidia-drm] [GPU ID 0x00000100] Unloading driver
[ 54.809272] nvidia-modeset: Unloading
[ 54.841349] nvidia-nvlink: Unregistered the Nvlink Core, major device number 236
[ 69.063004] VFIO - User Level meta-driver version: 0.3
[ 69.086094] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 69.086362] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 69.185755] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 418.67 Sat Apr 6 03:07:24 CDT 2019
[ 69.187775] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 418.67 Sat Apr 6 02:43:09 CDT 2019
[ 69.189837] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 510
[ 69.189906] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 69.189907] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
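The dmesg excerpt above can be condensed into a load/unload timeline with a short Python sketch like the one below. The function name and the keyword-based classification are assumptions made for illustration, not anything the driver defines. Applied to this excerpt, it labels the nvidia-modeset line at 69.187775 as a load, which seems at odds with the installer's error.

```python
import re

# Matches lines such as "[ 69.189837] nvidia-uvm: Loaded the UVM driver ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def nvidia_events(dmesg_text):
    """Return (timestamp, kind, message) tuples for NVIDIA-related dmesg lines.

    kind is "load", "unload", or "other", guessed from keywords in the message.
    """
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        lower = message.lower()
        if "nvidia" not in lower and "nvrm" not in lower:
            continue  # not an NVIDIA driver message
        # Check unload keywords first, because "unload" contains "load".
        if "unload" in lower or "unregistered" in lower:
            kind = "unload"
        elif "load" in lower or "initializ" in lower:
            kind = "load"
        else:
            kind = "other"
        events.append((timestamp, kind, message))
    return events


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    for timestamp, kind, message in nvidia_events(sys.stdin.read()):
        print(f"{timestamp:12.6f}  {kind:6}  {message}")
```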
I went through this a couple of times, then finally removed the Nouveau blacklisting and switched back to my CUDA-less configuration without issue. This feels like some kind of error on my part, but it’s not obvious to me what it is. I’m not sure if this matters, but in both GUI and level-3 mode, I have my monitor plugged into the GTX1050 GPU, not the motherboard’s output from the 7700k. It seems like this method should work. I’m not averse to using the Debian Installer method if that is more likely to succeed, but I’d rather understand what is going on here. Thanks in advance for any suggestions.
cuda-installer.log (20.8 KB)