Problems with openSUSE and my ASUS ROG GL752VW-DH71

I recently got a new laptop to replace my ancient Asus A7k. It’s a lovely ASUS ROG GL752VW-DH71.

I’ve tried to install Leap 42.2 several times without much success. I understand that it is an Optimus platform. Accordingly, I’ve followed the Bumblebee instructions (https://en.opensuse.org/SDB:NVIDIA_Bumblebee). I’ve tried both btrfs and ext4 filesystems for both home and root partitions. Neither has changed the end result.

During my latest install attempt, I got to the point where the NVIDIA drivers (easy way) were installing, but I hit soft-locks on the CPUs that wouldn’t go away for the last two RPMs (this has been typical of every attempt thus far… very annoying). I rebooted and wasn’t able to log in with my user ID on the default kernel (it hung at the text login). So I thought I’d resume using the debug kernel. I finished installing the rest of the RPMs and completed the Bumblebee config steps (perhaps not the best idea, but it was late, I was frustrated, and I’d had little sleep). Now, with the debug kernel, the regular kernel, and regular safe mode, I boot and get to the text login. It stays alive long enough for me to type 4 characters (i.e. root) as login credentials, then it hangs.

The only way I can log in to Linux at this point is debug, safe mode. By some small miracle I get a graphical login and can get into Plasma.

When I run optirun --status, it tells me that X is inactive and discrete video is likely on. optirun glxgears tells me that secondary GPU driver could not be loaded.

Any suggestions would be helpful (please tell me I don’t need to reinstall again!). I’ve also tried ‘intel_pstate=disable’ and ‘nomodeset’ or ‘nouveau.modprobe=0’ after doing research within the forums. Unfortunately this hasn’t moved me any further.

I know I should be including various diagnostics, but it’s been a while since I had to build drivers, so bear with me. Let me know what is needed so I don’t waste anyone’s time.

Cheers
gto

Do you have a setting Discrete/Intel/Hybrid in your BIOS? If so, what’s it set to?
For more info, run nvidia-bug-report.sh and attach output file to your post.

No such setting in my BIOS. The BIOS is American Megatrends V2.0.1034.

Unfortunately no luck while running in debug -recoverymode

nvidia-bug-report.sh --safe-mode

nvidia-bug-report.sh will now collect information about your
system and create the file ‘nvidia-bug-report.log.gz’ in the current
directory. It may take several seconds to run. In some
cases, it may hang trying to capture data generated dynamically
by the Linux kernel and/or the NVIDIA kernel module. While
the bug report log file will be incomplete if this happens, it
may still contain enough data to diagnose your problem.

Please include the ‘nvidia-bug-report.log.gz’ log file when reporting
your bug via the NVIDIA Linux forum (see devtalk.nvidia.com)
or by sending email to ‘linux-bugs@nvidia.com’.

Running nvidia-bug-report.sh…ls: cannot access ‘/proc/driver/nvidia/./gpus/’: No such file or directory
complete.

I’ll see if I can log in under one of the other kernels
nvidia-bug-report.log.gz (60.6 KB)

ok, so I can’t get past a login screen under any of the default kernels.

When I load just the default kernel, I get hung at the login prompt

When I run kernel-default-recovery, I get hung at the login prompt but at least I can see a bit more info.

in one boot case I noted:
bbswitch: successfully loaded: discrete card 0000:01:00:0 is off

In another I noted that Linux 4.4.62.-18.6-default xhci-hcd was the
kernel loaded, following
NVIDIA: module license “NVIDIA” taints kernel

While I would hope that a simple change of bbswitch configuration would fix things, I’ll post a question pre-emptively in the case that I need to reinstall everything:
->is there any way to prevent CPU softlocks from occurring during the installation of the NVIDIA drivers?

As a first note, bumblebee setups are not supported in this forum.
Unfortunately, your logs were not useful, as your dmesg was flooded with battery events due to debug mode.
But from your description, you’re most likely hit by this:
https://github.com/Bumblebee-Project/Bumblebee/issues/764#issuecomment-234494238
Please use acpidump and attach the output to your post and I’ll have a look at it.

Ah, sorry about that; I didn’t realize that was the case. The post that you suggested describes very similar symptoms. I expect only the kernel versions are slightly newer?

I’ve attached the acpidump file and dmesg (since I noted that it was requested on the link provided).

I’m happy to follow up on the Bumblebee thread should the above suggest that’s where the problem lies.

After an hour, the forum’s scan of the attachments still hadn’t completed tonight. I highly doubt I’m infected, as this is a brand new clean install… Calling it a night, well after midnight.

agto

Where did you attach the ACPIdump? Or did the forum software not let you attach it because it thinks it’s ‘infected’? Happens more often lately. Use https://gist.github.com/
or https://bugs.launchpad.net/lpbugreporter/+bug/752542

Yeah, the forum virus-scanner ate the acpidump.txt.gz file and I don’t think there’s an admin interface to get it back. Sorry!

I don’t see anything obvious in the attached bug report from the debug kernel. Is the system accessible via SSH when it hangs? One other option might be to configure netconsole to log to a different computer during the hang. Since this is a laptop, the last-resort fallback of a serial console probably isn’t possible.
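
In case it helps with the netconsole idea: a minimal sketch of building the module parameter, assuming the usual src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac format and the default ports 6665/6666. The helper name netconsole_param and all addresses are placeholders made up for illustration, so substitute your own interface and the receiving machine's IP and MAC.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): prints a netconsole=
# parameter string in the form
#   netconsole=<src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
# Ports 6665 (source) and 6666 (target) are the usual defaults.
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=$4
    printf 'netconsole=6665@%s/%s,6666@%s/%s\n' \
        "$src_ip" "$dev" "$tgt_ip" "$tgt_mac"
}

# Example with placeholder addresses (run as root on the laptop):
#   modprobe netconsole "$(netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55)"
```

The receiving machine then only needs something listening for UDP on port 6666. Wireless interfaces often don't work with netconsole, so a wired connection is the safer assumption.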

I figured that there was a problem with the forums’ antivirus scans.

I’ll use one of the other links suggested by generix tonight.

Thanks to both of you for following up with me.

Please see the files here
https://gist.github.com/gavinto/11de22af2b18f6da793e4f2df434cccd

Please try
acpi_osi=! acpi_osi="Windows 2009"
as kernel parameters and see if the hangs are gone with them, then generate a new nvidia-bug-report.sh log.
Since you have an ASUS, it is likely that any of backlight control/touchpad/function keys aren’t working afterwards. You can read this to the end to get a clue what to do then:
https://devtalk.nvidia.com/default/topic/1001621/linux/prime-select-nvidia-gt-log-off-doesn-t-work-but-reboot-does-on-375-quadro-m1000m-dell-precision-550/

@aplattner: Can you find someone with pcie/drivers expertise to have a look at this:
https://bugzilla.kernel.org/show_bug.cgi?id=156341
Something in the kernel is really broken regarding PCIe power management and nobody seems to have a clue what is wrong. A rapidly rising number of Skylake/Kaby Lake laptops with NVIDIA GPUs are affected and there’s not always an easy workaround. I fear that once Windows 7 is no longer supported by new laptops’ firmware, this will break completely.
One noteworthy symptom is that all claim not to support ASPM. Is this a change in policy to enable this only on driver level so that Linux breaks something by disabling it altogether?

ok, so I modified /etc/default/grub file

GRUB_CMDLINE_LINUX_DEFAULT=“resume=/dev/disk/by-uuid/fd4f92b1-b6e9-4bee-972e-6dc8f92fa47a splash=silent quiet showopts” acpi_osi=!acpi_osi=“Windows 2009”

I then ‘grub2-mkconfig -o /boot/grub2/grub.cfg’ as instructed, and nothing major happened. same behaviour as before

Earlier in the evening I tried GRUB_CMDLINE_LINUX_DEFAULT=“resume=/dev/disk/by-uuid/fd4f92b1-b6e9-4bee-972e-6dc8f92fa47a splash=silent quie t showopts” “acpi_osi=!acpi_osi=Windows 2009”
forgetting to run grub2-mkconfig, and got a bunch of new PCIe error messages that I’d never seen before.

Previous incarnations of openSUSE (13.x) provided command access at boot, so this method is pretty different for me. I’m sure that I have a typo or syntax error somewhere here.

Please try

GRUB_CMDLINE_LINUX_DEFAULT="resume=/dev/disk/by-uuid/fd4f92b1-b6e9-4bee-972e-6dc8f92fa47a splash=silent quiet showopts acpi_osi=! acpi_osi=\"Windows 2009\""
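
The point of that line is that both acpi_osi entries sit inside the outer quotes, separated by a space. After running grub2-mkconfig and rebooting, you can check whether they actually reached the kernel. A rough sketch, where check_acpi_osi is just a made-up helper name and the second pattern is an assumption to cover GRUB possibly re-quoting the whole argument:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given kernel command line
# contains "acpi_osi=!" as its own word AND the "Windows 2009" override.
check_acpi_osi() {
    cmdline=$1
    # "acpi_osi=!" must be followed by a space or end the string,
    # so a run-together "acpi_osi=!acpi_osi=..." is rejected.
    case $cmdline in
        *'acpi_osi=! '*|*'acpi_osi=!') ;;
        *) return 1 ;;
    esac
    # Accept the value with or without inner quotes.
    case $cmdline in
        *'acpi_osi="Windows 2009"'*|*'acpi_osi=Windows 2009'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after reboot:
#   check_acpi_osi "$(cat /proc/cmdline)" && echo "acpi_osi override active"
```

If the check fails, looking at /proc/cmdline directly shows what the kernel really received.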

thanks for this. I got a bit further this time and am indeed using default kernel

No nvidia driver though (touch pad works so based on your comments I figured as much)

nvidia-bug-report.sh --safe-mode

Running nvidia-bug-report.sh…ls: cannot access ‘/proc/driver/nvidia/./gpus/’: No such file or directory
complete.

nvidia-bug-report.log.gz (60.6 KB)

looks like the scanner is acting up again
please see

The last logs were useful. Still not looking perfect, though. Two things:

  1. Your Bumblebee install looks misconfigured or not configured at all: it’s loading the nvidia glx module on the Intel X server, which doesn’t work.
  2. Your BIOS seems buggy, so it would need kernel quirks. Those usually don’t get backported, so the 4.4 kernel of Leap might not be the best choice for your hardware.

Bumblebee config is beyond the scope of this forum, you might purge it and try with a plain PRIME setup for further debugging.
A newer kernel might be worth trying to see if kernel oopses vanish.

Thanks for this,

I’ll try to update kernel and sort out bumblebee config.

I’ll investigate the Prime option as well.

Should I have success or require further debugging I’ll be sure to follow up on this thread.

If my help is needed to test solutions to the pcie energy management woes, let me know.