CUDA Installation + Ubuntu 16.04 Error

Hi,

I tried several ways to install CUDA on my notebook, a Fujitsu Celsius H760 with a Quadro M2000M GPU. I followed the README files and this document (Installation Guide Linux :: CUDA Toolkit Documentation).

sebastian@sebastian:~$ lspci | grep -i nvidia
01:00.0 3D controller: NVIDIA Corporation GM107GLM [Quadro M2000M] (rev a2)
sebastian@sebastian:~$ uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial
sebastian@sebastian:~$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE
root@sebastian:~# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  gnome-software gnome-software-common liboxideqt-qmlplugin liboxideqtcore0
  liboxideqtquick0 oxideqt-codecs snapd ubuntu-core-launcher ubuntu-software
0 upgraded, 0 newly installed, 0 to remove and 9 not upgraded.
root@sebastian:~# apt-get install linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree
Reading state information... Done
linux-headers-4.4.0-62-generic is already the newest version (4.4.0-62.83).
0 upgraded, 0 newly installed, 0 to remove and 9 not upgraded.
root@sebastian:~# md5sum Downloads/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
 16b0946a3c99ca692c817fb7df57520c

→ The checksum matches.
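
The same comparison can be scripted instead of done by eye. A minimal sketch, assuming GNU md5sum is available; `verify_md5` is a hypothetical helper name, not part of any NVIDIA tooling:

```shell
# Hypothetical helper: succeed only if FILE has the expected MD5 checksum.
# Usage: verify_md5 EXPECTED_HEX FILE
verify_md5() {
    # md5sum --check reads "checksum  filename" lines from standard input;
    # --status suppresses output and reports the result via the exit code.
    printf '%s  %s\n' "$1" "$2" | md5sum --check --status
}
```

For the package above this would be called as `verify_md5 16b0946a3c99ca692c817fb7df57520c Downloads/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb && echo "checksum OK"`.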

root@sebastian:~# sudo dpkg -i Downloads/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
Selecting previously unselected package cuda-repo-ubuntu1604.
(Reading database ... 207468 files and directories currently installed.)
Preparing to unpack .../cuda-repo-ubuntu1604_8.0.44-1_amd64.deb ...
Unpacking cuda-repo-ubuntu1604 (8.0.44-1) ...
Setting up cuda-repo-ubuntu1604 (8.0.44-1) ...
OK
root@sebastian:~# sudo apt-get update
Hit:1 http://de.archive.ubuntu.com/ubuntu xenial InRelease
Hit:2 http://de.archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:3 http://de.archive.ubuntu.com/ubuntu xenial-backports InRelease
Get:4 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
Ign:5 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 InRelease
Get:6 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release [564 B]
Get:7 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release.gpg [819 B]
Get:8 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages [22.5 kB]
Fetched 126 kB in 4s (28.7 kB/s)
Reading package lists... Done
W: Invalid 'Date' entry in Release file /var/lib/apt/lists/partial/developer.download.nvidia.com_compute_cuda_repos_ubuntu1604_x86%5f64_Release

→ First warning: the Date entry in the Release file is reported as invalid.

root@sebastian:~# lsmod | grep nouveau
nouveau              1495040  0
mxm_wmi                16384  1 nouveau
wmi                    20480  2 mxm_wmi,nouveau
video                  40960  2 i915_bpo,nouveau
i2c_algo_bit           16384  2 i915_bpo,nouveau
ttm                    94208  1 nouveau
drm_kms_helper        155648  2 i915_bpo,nouveau
drm                   364544  7 ttm,i915_bpo,drm_kms_helper,nouveau

→ nouveau was loaded, so I disabled it by creating the following file with vi:

cat /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
root@sebastian:~# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-4.4.0-62-generic
W: Possible missing firmware /lib/firmware/i915/kbl_dmc_ver1.bin for module i915_bpo
root@sebastian:~# reboot

root@sebastian:~# lsmod | grep nouveau
root@sebastian:~#

root@sebastian:~# modprobe nouveau
root@sebastian:~#

→ The nouveau driver is no longer loaded.
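
This check can also be scripted, which avoids misreading lsmod output where nouveau appears only in the "Used by" column of other modules. A minimal sketch; `module_loaded` is a hypothetical helper that looks at the first column only. One caveat, as far as modprobe.d(5) describes it: a `blacklist` line only stops automatic loading through module aliases, so an explicit `modprobe nouveau` may still load the module. The check is therefore most meaningful right after a reboot.

```shell
# Hypothetical helper: read lsmod-style output on stdin and succeed only if
# MODULE appears in the first column (the module name itself), not merely
# in the "Used by" list of another module.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit found ? 0 : 1 }'
}
```

Typical use would be `lsmod | module_loaded nouveau && echo "nouveau is loaded"`.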

Next, I switched to a console and stopped lightdm:

service lightdm stop

→ Installed CUDA via apt-get:

apt-get install cuda

→ Rebooted.

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX X86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
GCC version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1 16.04.4)
modprobe nvidia

→ No output.

In the kernel log I found the following messages:

Feb  8 16:02:18 sebastian kernel: [    0.937105] nvidia: module license 'NVIDIA' taints kernel.
Feb  8 16:02:18 sebastian kernel: [    0.937107] Disabling lock debugging due to kernel taint
Feb  8 16:02:18 sebastian kernel: [    0.939462] pps_core: LinuxPPS API ver. 1 registered
Feb  8 16:02:18 sebastian kernel: [    0.939464] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
Feb  8 16:02:18 sebastian kernel: [    0.940092] nvidia: module verification failed: signature and/or required key missing - tainting kernel
Feb  8 16:02:18 sebastian kernel: [    0.940545] ahci 0000:00:17.0: version 3.0
Feb  8 16:02:18 sebastian kernel: [    0.940568] ahci 0000:00:17.0: can't derive routing for PCI INT A
Feb  8 16:02:18 sebastian kernel: [    0.940570] ahci 0000:00:17.0: PCI INT A: no GSI
Feb  8 16:02:18 sebastian kernel: [    0.940604] PTP clock support registered
Feb  8 16:02:18 sebastian kernel: [    0.940662] ahci 0000:00:17.0: SSS flag set, parallel bus scan disabled
Feb  8 16:02:18 sebastian kernel: [    0.940702] ahci 0000:00:17.0: AHCI 0001.0301 32 slots 3 ports 6 Gbps 0xe impl RAID mode
Feb  8 16:02:18 sebastian kernel: [    0.940704] ahci 0000:00:17.0: flags: 64bit ncq sntf ilck stag led clo only pio slum part ems sxs deso sadm sds apst 
Feb  8 16:02:18 sebastian kernel: [    0.943936] pcieport 0000:00:01.0: can't derive routing for PCI INT A
Feb  8 16:02:18 sebastian kernel: [    0.943938] nvidia 0000:01:00.0: PCI INT A: no GSI
Feb  8 16:02:18 sebastian kernel: [    0.943969] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
Feb  8 16:02:18 sebastian kernel: [    0.944093] nvidia-nvlink: Nvlink Core is being initialized, major device number 246
Feb  8 16:02:18 sebastian kernel: [    0.944136] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  367.57  Mon Oct  3 20:37:01 PDT 2016
Feb  8 16:02:18 sebastian kernel: [    2.497545] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 242
Feb  8 16:02:18 sebastian kernel: [    2.507958] pcieport 0000:00:01.0: can't derive routing for PCI INT B
Feb  8 16:02:18 sebastian kernel: [    2.509975] NVRM: failed to copy vbios to system memory.
Feb  8 16:02:18 sebastian kernel: [    2.510247] NVRM: RmInitAdapter failed! (0x30:0xffff:645)
Feb  8 16:02:18 sebastian kernel: [    2.510286] NVRM: rm_init_adapter failed for device bearing minor number 0
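
To pull only the driver's own failure lines out of a log like this, a small filter helps. A minimal sketch, assuming the driver prefixes its messages with `NVRM:` as in the excerpt above; `nvrm_failures` is a hypothetical helper name:

```shell
# Hypothetical helper: read kernel log lines on stdin and print only the
# NVRM messages that mention a failure or an error.
nvrm_failures() {
    grep -E 'NVRM: .*(failed|error)'
}
```

It can be fed from either source, for example `dmesg | nvrm_failures` or `nvrm_failures < /var/log/kern.log`. On the excerpt above it keeps the three lines from "failed to copy vbios" through "rm_init_adapter failed".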

The full kern.log and xorg.log can be found here: kern.log - Pastebin.com / xorg.log - Pastebin.com

Does anybody know how to fix it?

I cannot use my GPU from the command line, and Xorg also fails to start.

Here is the output of the nvidia-bug-report.sh script. I could not run it while X was running, because X does not start and shows this error message:

Fatal server error:
(EE) no screens found

After booting, Linux does not show a login screen. I have to switch to a console with Ctrl+Alt+F1 and log in there.

The output can be found here: nvidia-bug-report - Pastebin.com

nvidia-bug-report.log.gz (66.4 KB)