prime-select nvidia > log off doesn't work, but reboot does on 375, Quadro M1000M, Dell Precision 5510

If I run prime-select nvidia, then log off, I get a black screen with a blinking white cursor in the top left. Sometimes I get a dialog saying that graphics, input and other devices weren’t found, and offering to configure them. From there, I can’t get anywhere, and have to press the power button (which initiates and completes a shutdown). Upon restarting, prime-select query returns nvidia. If I do prime-select intel, then logging out makes it stick, as expected.

I also noticed that if I install bumblebee, then rebooting doesn’t work either, and optirun etc. doesn’t work at all. If I purge bumblebee, then prime-select nvidia works after a reboot.

A couple of months back, optirun worked as expected.

I’ve run the following:

apt-get purge '.*nvidia.*' 'bumblebee.*' '.*primus.'
apt-get install nvidia-378 nvidia-prime nvidia-settings
apt-get install cuda

I’d appreciate any help in tracking down the issue. System details are:

  • Dell Precision 5510, with intel i915, and Quadro M1000M card with the quadro feeding through the intel.
  • Ubuntu 16.10
  • kernel 4.10.6

Regards,
Ashic.

I have the same issue. I’ve tried both nvidia-378 and nvidia-375. Reboot works; log-off and log-in doesn’t. I don’t get a blinking white cursor or a dialog, though. I simply return to the login screen.

Other details:

  • ASUS FX553VD, with Intel i915 (7700HQ), GTX1050
  • Ubuntu GNOME 16.04
  • Kernel 4.10.0

Logging off isn’t enough if you’re using GNOME with GDM, because GDM runs its own X server on vt7 and spawns a second X server for the user session on vt2 or whichever VT is free. A workaround might be to log off and then zap the GDM X server (Ctrl+Alt+Backspace).

I tried zapping the X server after logging off. The X server didn’t start again. The screen was a regular terminal with the kernel boot messages.

Then run nvidia-bug-report.sh and attach the output to your post.

FWIW, I booted into text mode (disabling GDM) and ran startx. I was able to use my GNOME desktop normally. I then ran ‘prime-select nvidia’ and logged out back into the tty. Running startx again failed.

Does this forum support attachments? I can’t find the attach button…

You can attach files to an existing post; the option is next to the edit button.

I’m unable to upload the log file. It keeps getting flagged as infected. I’ve uploaded it to https://gist.github.com/anonymous/a26c796369c5581c3cc932da8c64bb45

Looks like an ACPI problem, the dGPU is refusing to power off. Please use acpidump and attach output, maybe there’s a workaround.
BTW your usb drive is damaged.

I’ve uploaded the output of acpidump to https://gist.githubusercontent.com/anonymous/8242c42c24856ebce458fb064f6c7c09/raw/6b2224f9f7d8774cda9933caa6c1183849dfaa64/acpidump.txt

What USB drive? I do not have any USB drive plugged in.

From your dmesg:

[   69.991208] usb-storage 1-2:1.0: USB Mass Storage device detected
[   69.991316] scsi host3: usb-storage 1-2:1.0
[   69.991370] usbcore: registered new interface driver usb-storage
[   70.002181] usbcore: registered new interface driver uas
[   71.013683] scsi 3:0:0:0: Direct-Access     Seagate  FreeAgent GoFlex 0148 PQ: 0 ANSI: 4
[   71.014066] sd 3:0:0:0: Attached scsi generic sg1 type 0
[   71.014218] sd 3:0:0:0: [sdb] 976773167 512-byte logical blocks: (500 GB/466 GiB)
[   71.014585] sd 3:0:0:0: [sdb] Write Protect is off
[   71.014586] sd 3:0:0:0: [sdb] Mode Sense: 1c 00 00 00
[   71.014938] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   71.081276]  sdb: sdb1
[   71.082562] sd 3:0:0:0: [sdb] Attached SCSI disk
[   71.332251] sd 3:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[   71.332253] sd 3:0:0:0: [sdb] tag#0 Sense Key : Hardware Error [current] [descriptor] 
[   71.332254] sd 3:0:0:0: [sdb] tag#0 Add. Sense: No additional sense information
[   71.332255] sd 3:0:0:0: [sdb] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 00 e5 00
[   71.551738] sd 3:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[   71.551739] sd 3:0:0:0: [sdb] tag#0 Sense Key : Hardware Error [current] [descriptor] 
[   71.551740] sd 3:0:0:0: [sdb] tag#0 Add. Sense: No additional sense information
[   71.551741] sd 3:0:0:0: [sdb] tag#0 CDB: ATA command pass through(12)/Blank a1 06 20 da 00 00 4f c2 00 b0 00 00

About ACPI: looking at the dump, you should add the following to your kernel parameters:
acpi_osi=! acpi_osi="Windows 2009"
Maybe other fixes have to be applied to get backlight/function keys working again.
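
For anyone unsure how to add those parameters on Ubuntu, here is a minimal sketch, assuming GRUB is the bootloader and the defaults live in /etc/default/grub. The function name add_acpi_osi_params is made up for illustration, and the escaped-quote form it writes is a commonly used convention rather than something confirmed in this thread. It only prints the modified file so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: print a copy of a GRUB defaults file with the ACPI parameters
# appended to GRUB_CMDLINE_LINUX_DEFAULT. Nothing is modified in place.
# Assumption: the existing value is a simple double-quoted string with no
# embedded quotes and no acpi_osi entries yet.
add_acpi_osi_params() {
    # $1 = path to the GRUB defaults file (normally /etc/default/grub)
    sed 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"$/\1 acpi_osi=! acpi_osi=\\"Windows 2009\\""/' "$1"
}

# Example (review the output before installing it):
#   add_acpi_osi_params /etc/default/grub > /tmp/grub.new
#   sudo cp /tmp/grub.new /etc/default/grub && sudo update-grub
```

After a reboot, cat /proc/cmdline should show whether the parameters actually reached the kernel.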

Oh yeah! I had a portable hard drive plugged in earlier. Thanks!

I added those two kernel parameters, but I still can’t switch GPUs without rebooting. I boot into GNOME, run prime-select to switch GPU, log off, and press Ctrl+Alt+Backspace. GDM doesn’t start again; I’m stuck in the terminal and the screen flickers.

Can you post a new nvidia-bug-report.sh?

Sure. Here: https://gist.github.com/anonymous/b56b42741cd9ccf01f01040fa6af1d80

This is without GDM btw, I’m booting into text mode, running startx, running prime-select after GNOME starts, pressing Ctrl+Alt+Backspace, then running startx again.

If you disabled the display manager, you will have to start gpu-manager manually (systemctl start gpu-manager). Without it, no actual GPU switching takes place. So you will have to generate new logs and, while you’re at it, also post /var/log/gpu-manager.log

Oh! I ran systemctl start gpu-manager this time.
Here’s the nvidia-bug report: https://gist.github.com/anonymous/186a03c414f795e4d62916c00087cdad

And the gpu-manager.log file:

log_file: /var/log/gpu-manager.log
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-fglrx-was-loaded file
Looking for fglrx modules in /lib/modules/4.10.0-15-generic/updates/dkms
Looking for nvidia modules in /lib/modules/4.10.0-15-generic/updates/dkms
Found nvidia module: nvidia_375_modeset.ko
Is nvidia loaded? yes
Was nvidia unloaded? no
Is nvidia blacklisted? yes
Is fglrx loaded? no
Was fglrx unloaded? no
Is fglrx blacklisted? no
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is fglrx kernel module available? no
Is nvidia kernel module available? yes
Vendor/Device Id: 8086:591b
BusID "PCI:0@0:2:0"
Is boot vga? yes
Vendor/Device Id: 10de:1c8d
BusID "PCI:1@0:0:0"
Is boot vga? no
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card1", driven by "nvidia-drm"
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card1", driven by "nvidia-drm"
Skipping "/dev/dri/card0", driven by "i915"
Skipping "/dev/dri/card1", driven by "nvidia-drm"
Found "/dev/dri/card0", driven by "i915"
output 0:
	card0-eDP-1
Number of connected outputs for /dev/dri/card0: 1
Does it require offloading? yes
last cards number = 1
Has amd? no
Has intel? yes
Has nvidia? yes
How many cards? 2
The number of cards has changed!
Has the system changed? Yes
main_arch_path x86_64-linux-gnu, other_arch_path i386-linux-gnu
Current alternative: /usr/lib/x86_64-linux-gnu/mesa/ld.so.conf
Current core alternative: (null)
Current egl alternative: /usr/lib/x86_64-linux-gnu/mesa-egl/ld.so.conf
Is nvidia enabled? no
Is nvidia egl enabled? no
Is fglrx enabled? no
Is mesa enabled? yes
Is mesa egl enabled? yes
Is pxpress enabled? no
Is prime enabled? no
Is prime egl enabled? no
Is nvidia available? yes
Is nvidia egl available? no
Is fglrx available? no
Is fglrx-core available? no
Is mesa available? yes
Is mesa egl available? yes
Is pxpress available? no
Is prime available? yes
Is prime egl available? no
System configuration has changed
Intel IGP detected
Intel hybrid system
Nvidia driver version 375.66 detected
/sys/class/dmi/id/product_version="1.0"
/sys/class/dmi/id/product_name="GL553VD"
1st try: bbswitch without quirks
Loading bbswitch with "load_state=-1 unload_state=1" parameters
Selecting prime
/usr/bin/update-alternatives --set x86_64-linux-gnu_gl_conf /usr/lib/nvidia-375-prime/ld.so.conf
update-alternatives status 0
Calling ldconfig
ldconfig status 0
/usr/bin/update-alternatives --set i386-linux-gnu_gl_conf /usr/lib/nvidia-375-prime/alt_ld.so.conf
update-alternatives status 0
Calling ldconfig
ldconfig status 0
/usr/bin/update-alternatives --set x86_64-linux-gnu_egl_conf /usr/lib/nvidia-375-prime/ld.so.conf
update-alternatives status 0
Calling ldconfig
ldconfig status 0
/usr/bin/update-alternatives --set i386-linux-gnu_egl_conf /usr/lib/nvidia-375-prime/alt_ld.so.conf
update-alternatives status 0
Calling ldconfig
ldconfig status 0
Removing xorg.conf. Path: /etc/X11/xorg.conf
Powering off the discrete card
Disabling persistence mode
Unloading nvidia-uvm with "no" parameters
Unloading nvidia-drm with "no" parameters
Unloading nvidia-modeset with "no" parameters
Unloading nvidia with "no" parameters
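
Since the full log is long, a small filter can pull out the lines that show the driver state and the action gpu-manager took. This is only a sketch: the function name summarize_gpu_manager_log and the choice of patterns are assumptions based on the log lines shown above, not part of gpu-manager itself.

```shell
#!/bin/sh
# Sketch: print only the lines of a gpu-manager log that describe the
# driver state and the action taken. The patterns are taken from the
# log excerpt in this thread and may not cover other versions.
summarize_gpu_manager_log() {
    # $1 = log path; defaults to the location used in this thread
    log_file="${1:-/var/log/gpu-manager.log}"
    grep -E '^(Is (nvidia|intel|nouveau) (loaded|blacklisted)\?|Is (nvidia|prime) enabled\?|Selecting |Powering off |Unloading )' "$log_file"
}

# Example:
#   summarize_gpu_manager_log /var/log/gpu-manager.log
```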

I don’t think the gpu-manager service is actually running, look at the output of systemctl status gpu-manager:

~$ systemctl status gpu-manager
● gpu-manager.service - Detect the available GPUs and deal with any system changes
   Loaded: loaded (/lib/systemd/system/gpu-manager.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Wed 2017-05-31 01:05:38 IST; 5min ago
  Process: 1473 ExecStart=/usr/bin/gpu-manager --log /var/log/gpu-manager.log (code=exited, status=0/SUCCESS)
 Main PID: 1473 (code=exited, status=0/SUCCESS)

May 31 01:05:36 medakk-pc gpu-manager[1473]: /etc/modprobe.d is not a file
May 31 01:05:36 medakk-pc gpu-manager[1473]: update-alternatives: error: no alternatives for x86_64-linux-gnu_gfxcore_conf
May 31 01:05:37 medakk-pc gpu-manager[1473]: update-alternatives: using /usr/lib/nvidia-375-prime/ld.so.conf to provide /etc/ld.so.conf.d/x86_64-linux-gnu_GL.conf (x86_64-linux-gnu_gl_conf
May 31 01:05:37 medakk-pc gpu-manager[1473]: update-alternatives: using /usr/lib/nvidia-375-prime/alt_ld.so.conf to provide /etc/ld.so.conf.d/i386-linux-gnu_GL.conf (i386-linux-gnu_gl_conf
May 31 01:05:37 medakk-pc gpu-manager[1473]: update-alternatives: using /usr/lib/nvidia-375-prime/ld.so.conf to provide /etc/ld.so.conf.d/x86_64-linux-gnu_EGL.conf (x86_64-linux-gnu_egl_co
May 31 01:05:37 medakk-pc gpu-manager[1473]: update-alternatives: using /usr/lib/nvidia-375-prime/alt_ld.so.conf to provide /etc/ld.so.conf.d/i386-linux-gnu_EGL.conf (i386-linux-gnu_egl_co
May 31 01:05:38 medakk-pc gpu-manager[1473]: Persistence mode is already Disabled for GPU 0000:01:00.0.
May 31 01:05:38 medakk-pc gpu-manager[1473]: All done.
May 31 01:05:38 medakk-pc gpu-manager[1473]: rmmod: ERROR: Module nvidia_uvm is not currently loaded
May 31 01:05:38 medakk-pc systemd[1]: Started Detect the available GPUs and deal with any system changes.

gpu-manager is a one-shot service that always runs once before the display manager starts,
so the sequence would be
boot
gpu-manager
startx
prime-select
stop X
gpu-manager
startx
nvidia-bug-report.sh

Maybe do this twice, once for nvidia->intel and once for intel->nvidia.
Or just enable the display manager again, boot to the login screen, switch to a console, then run
prime-select
systemctl restart display-manager
nvidia-bug-report.sh

Your last logs looked good, but I don’t know which state they were taken in.
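
The display-manager variant can be sketched as a small script, to be run as root from a virtual console after logging out of the desktop. The helper names run and switch_gpu and the DRY_RUN variable are made up for illustration; the only real commands used are prime-select and systemctl restart display-manager, both mentioned in this thread.

```shell
#!/bin/sh
# Sketch: select a PRIME profile, then restart the display manager so
# that gpu-manager runs again before the new X server starts.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

switch_gpu() {
    # $1 = target profile, either nvidia or intel
    case "$1" in
        nvidia|intel) ;;
        *) echo "usage: switch_gpu nvidia|intel" >&2; return 2 ;;
    esac
    run prime-select "$1" || return 1
    # Restarting the display manager ends every running graphical session.
    run systemctl restart display-manager
}

# Example (prints the commands without running them):
#   DRY_RUN=1 switch_gpu nvidia
```

Running nvidia-bug-report.sh after logging back in then captures the state after the switch.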

This is odd: I’m able to switch GPUs without rebooting when I use “systemctl restart display-manager”! I had previously tried Ctrl+Alt+Backspace and “/etc/init.d/gdm3 restart” to restart the display manager, both of which didn’t work.

So, here’s the exact steps I’ve followed:

  1. Boot with acpi_osi=! acpi_osi="Windows 2009"
  2. Login normally
  3. Use prime-select to switch GPU
  4. Log off
  5. Switch to a virtual console
  6. Run systemctl restart display-manager
  7. Log in again

Everything seems to be working fine. Any idea why zapping the X server didn’t work?

Unfortunately, the touchpad doesn’t work now.