No Display When Driver Is Installed on Ubuntu 16.04 Gnome

I have been running Ubuntu 16.04 Gnome and have had a big issue with GDM whenever I install driver 375.66, 378.13, or 381.22.

Once the driver is installed, the message below is repeated in the syslog every 5 seconds:
https://pastebin.com/UDVGiFJj

Some posts on the internet advise making sure nouveau is blacklisted from the kernel modules when the machine starts up. But on Ubuntu Gnome nouveau isn’t present, so there isn’t anything to blacklist.
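Whether nouveau is actually loaded can be checked directly rather than inferred. The sketch below uses a hypothetical helper (the name module_loaded is not from this thread) that looks up a module name in /proc/modules, the same list that lsmod prints. The optional second argument exists only so the function can be pointed at a different file.

```shell
# Report whether a kernel module is currently loaded.
#   $1: module name, e.g. nouveau
#   $2: module list to read (assumption: defaults to /proc/modules)
# Exit status is 0 if the module is listed, non-zero otherwise.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example use:
#   if module_loaded nouveau; then echo "nouveau is loaded"; else echo "nouveau is not loaded"; fi
```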

You might be able to reproduce this by:

  • installing a fresh copy of Ubuntu 16.04 Gnome,
  • add-apt-repository ppa:graphics-drivers/ppa
  • apt-get update
  • apt-get install nvidia-{driverNumber}

Any advice on how to resolve this would be great.

Thanks
James

You have an Optimus laptop, so on Ubuntu you should install the package nvidia-prime and then use prime-select to choose either the Intel or the NVIDIA GPU.

Hi Generix, it’s not a laptop. I have a P100 in a server, and there is a battle between the Intel onboard graphics and the P100, which has no graphics output. I have tried prime-select, but it doesn’t do a great deal; prime-select query shows that the device is set to nvidia. I also moved from Gnome to Unity, which didn’t help much either.

dmesg shows a kernel taint message, and lightdm is struggling:

And lightdm still has problems, which a lot of other people seem to get:
Jun 19 15:36:46 sci-drive-ws systemd[1]: Starting Light Display Manager…
Jun 19 15:36:46 sci-drive-ws systemd[1]: Started Light Display Manager.
Jun 19 15:36:49 sci-drive-ws lightdm[1653]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared obj
Jun 19 15:36:49 sci-drive-ws lightdm[1653]: PAM adding faulty module: pam_kwallet.so
Jun 19 15:36:49 sci-drive-ws lightdm[1653]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared o
Jun 19 15:36:49 sci-drive-ws lightdm[1653]: PAM adding faulty module: pam_kwallet5.so
Jun 19 15:36:49 sci-drive-ws lightdm[1653]: pam_unix(lightdm-greeter:session): session opened for user lightdm by (uid=0)
Jun 19 15:36:49 sci-drive-ws lightdm[1711]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared obj
Jun 19 15:36:49 sci-drive-ws lightdm[1711]: PAM adding faulty module: pam_kwallet.so
Jun 19 15:36:49 sci-drive-ws lightdm[1711]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared o
Jun 19 15:36:49 sci-drive-ws lightdm[1711]: PAM adding faulty module: pam_kwallet5.so
Jun 19 15:36:49 sci-drive-ws lightdm[1711]: pam_succeed_if(lightdm:auth): requirement “user ingroup nopasswdlogin” not met by user "sc
Jun 19 15:37:04 sci-drive-ws lightdm[1711]: pam_unix(lightdm:session): session opened for user sciadmin by (uid=0)
Jun 19 15:37:08 sci-drive-ws lightdm[1444]: ** (lightdm:1444): CRITICAL **: session_get_login1_session_id: assertion ‘session != NULL’
Jun 19 15:37:10 sci-drive-ws lightdm[1444]: /etc/modprobe.d is not a file
Jun 19 15:37:10 sci-drive-ws lightdm[1444]: /etc/modprobe.d is not a file
Jun 19 15:37:10 sci-drive-ws lightdm[1444]: /etc/modprobe.d is not a file
Jun 19 15:37:10 sci-drive-ws lightdm[1444]: /etc/modprobe.d is not a file
Jun 19 15:37:10 sci-drive-ws lightdm[1444]: /etc/modprobe.d is not a file

Solution? Semi-solved.
This seems to be a common problem: the driver doesn’t work well with multiple graphics cards and kernel 4.8. I have managed to work around the problems with Unity and Gnome by running ‘apt-get install lubuntu-core’, which lets me log in with LXDE. Ubuntu Server: amazing. Ubuntu desktop: ghastly.

Ok, this clarifies your setup. Please run nvidia-bug-report.sh and attach the output file. It most likely just needs some xorg.conf tweaking.

Added file
nvidia-bug-report.log (515 KB)

The problem is that you don’t have an Intel iGFX; it’s an ‘ASPEED Graphics’ device. It…works, but without 3D acceleration or compositing, so you will only be able to use something like Xfce or Openbox.
You will have to set up a minimal xorg.conf for it:

Section "Module"
  Disable "glx"
EndSection

Section "Device"
  Identifier "Device0"
  VendorName "ASPEED"
  BusId "PCI:8:0:0"
EndSection

Section "Screen"
  Identifier "Screen0"
  Device "Device0"
  Monitor "Monitor0"
  DefaultDepth 24
  SubSection "Display"
    Depth 24
  EndSubSection
EndSection
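One detail worth checking in a config like this is the BusId value. lspci prints slots in hexadecimal (for example 08:00.0), while xorg.conf BusID values are normally written in decimal, in the form PCI:bus:device:function. The two look the same for bus 8 but differ from bus 0a upward. A minimal sketch of the conversion, using a hypothetical helper name:

```shell
# Convert an lspci slot (hex), e.g. "08:00.0" or "0000:08:00.0",
# into an xorg.conf BusID string (decimal), e.g. "PCI:8:0:0".
# The body runs in a subshell so its variables do not leak.
to_xorg_busid() (
    slot=$1
    # Drop a leading PCI domain such as "0000:" if one is present.
    case $slot in
        *:*:*) slot=${slot#*:} ;;
    esac
    bus=${slot%%:*}     # text before the first colon
    rest=${slot#*:}     # "device.function"
    dev=${rest%%.*}
    fn=${rest#*.}
    # printf interprets the 0x-prefixed values as hexadecimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
)

# Example use:
#   to_xorg_busid 08:00.0
```

The slot to feed in is the first column of the lspci line for the display adapter in question.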

Thanks for the advice, I’ll give these settings a go.

I’m still confused about why this causes a problem: I only get the issue once the nvidia driver package is installed. If I remove the nvidia package, lightdm starts working correctly and I’m able to log in.

By installing the nvidia-drivers, the gl(x) implementation gets switched from mesa to nvidia. When loading glx without nvidia-drivers, mesa probably uses software-rendering. Using nvidia glx on a non-nvidia device obviously doesn’t work.
On Ubuntu, you can try to switch the GL implementation back to mesa while the nvidia driver is installed:

# Configure for mesa GLX via update-alternatives
   update-alternatives --set i386-linux-gnu_gl_conf /usr/lib/i386-linux-gnu/mesa/ld.so.conf
   update-alternatives --set i386-linux-gnu_egl_conf /usr/lib/i386-linux-gnu/mesa-egl/ld.so.conf
   update-alternatives --set x86_64-linux-gnu_gl_conf /usr/lib/x86_64-linux-gnu/mesa/ld.so.conf
   update-alternatives --set x86_64-linux-gnu_egl_conf /usr/lib/x86_64-linux-gnu/mesa-egl/ld.so.conf

And then delete the ‘Disable glx’ line in xorg.conf.
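To see which GL implementation is currently selected, before or after switching, the output of update-alternatives --query can be inspected. The helper below is an illustrative sketch (gl_provider is a made-up name) that reads that output on stdin and classifies the Value: line by whether its path mentions mesa or nvidia. The path-matching rule is an assumption based on the mesa paths shown above.

```shell
# Classify the currently selected alternative as mesa, nvidia, or other.
# Reads the output of `update-alternatives --query <name>` on stdin and
# looks only at the "Value:" line, which holds the selected path.
gl_provider() {
    awk '$1 == "Value:" {
        if ($2 ~ /mesa/) print "mesa"
        else if ($2 ~ /nvidia/) print "nvidia"
        else print "other"
        exit
    }'
}

# Example use:
#   update-alternatives --query x86_64-linux-gnu_gl_conf | gl_provider
```

After changing the alternative, running ldconfig is probably also needed so that the linker cache picks up the new library path.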

A bit more info on the ASPEED gfx:
http://www.phoronix.com/scan.php?page=news_item&px=MTA5MTY
So without glx software rendering, more specific settings for the xorg.conf device section should be:

  Driver "modesetting"
  Option "AccelMethod" "none"

Using software rendering on a compute rig seems to me a waste of precious cpu cycles just for a bling-bling desktop.

Unfortunately this hasn’t resolved the issue. I just re-ran the bug report, which shows my upgraded kernel; the section that scans the dmesg/kernel log files for NVIDIA kernel messages is also slightly different.

Just to confirm my xorg config is:

Section "Module"
  Disable "glx"
EndSection

Section "Device"
  Identifier "Device0"
  VendorName "ASPEED"
  BusId "PCI:8:0:0"
  Driver         "modesetting"
  Option         "AccelMethod" "none"
EndSection

Section "Screen"
  Identifier "Screen0"
  Device "Device0"
  Monitor "Monitor0"
  DefaultDepth 24
  SubSection "Display"
    Depth 24
  EndSubSection
EndSection

nvidia-bug-report.log.gz.2.gz (214 KB)

The logs tell me that you now have a running X server without 2D/3D acceleration, which is fine. Barebones. Now you can use a display manager like xdm/lxdm or anything like that which does not depend on glx. The same goes for the window manager; maybe Xfce?
What are you trying to run on it? Keep in mind, the ASPEED is a simple 2D framebuffer device. Anything else would need software emulation, as mentioned.
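For anyone wanting to check the X server state themselves, Xorg marks error lines in its log with (EE). A small sketch, assuming the log is at the usual /var/log/Xorg.0.log (the function name and the optional path argument are illustrative, not from this thread):

```shell
# Print only the error lines from an Xorg log.
#   $1: log file (assumption: defaults to /var/log/Xorg.0.log)
# The pattern requires the (EE) marker at the start of the line, after an
# optional "[  12.345] " timestamp, so the marker legend near the top of
# the log is not reported as an error.
# grep exits non-zero when no error lines are found.
xorg_errors() {
    grep -E '^(\[ *[0-9.]+\] )?\(EE\)' "${1:-/var/log/Xorg.0.log}"
}

# Example use:
#   xorg_errors || echo "no (EE) lines found"
```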

Maybe take a different route. If you want a full-featured desktop on your hardware without much configuration, just buy a cheap GT710. Hook up your monitor to it, disable the ASPEED in the BIOS, generate an xorg.conf using nvidia-xconfig, and you’re finished.

I do have a spare NVS-295.

I did try this, but I took it out because I thought there were compatibility issues between the legacy 295 driver and the more modern P100 driver.

I’ll take another look at configuring this in the xorg.conf.

You can’t use that old thing; as you said, you can’t have the legacy driver and the P100 driver installed at the same time.

I’m going to look into getting a GT710; for now I will use the onboard graphics. I could get the amended xorg.conf file to work, but I just got a low-res system, so I applied /etc/xorg.conf.failsafe for the time being.

Thanks for your help.