X session fails to start on laptop with NVIDIA driver

This is a follow-up to this thread:
https://devtalk.nvidia.com/default/topic/935099/linux/-ee-nvidia-0-failed-to-enter-vt-mode-initialization-failed-with-quot-load-legacy-option-rom-qu/post/5006084/

  • Hardware: Alienware M17x R5 laptop with a few modifications made by the manufacturer relative to the base model (144 Hz display)
  • Video: GeForce GTX 780M
  • OS: Debian (the testing, unstable, and experimental repositories are all enabled, with testing at the highest priority, so the whole system is testing with a few manual overrides from unstable)
  • NVIDIA drivers: 375.82 (provided by the nvidia-driver 375.82-4 Debian package)

The issue first appeared after a driver upgrade from 361.45.18-3 to 367.44-3 (rolling back solved it at the time). It can be observed in two different manifestations:

  • After a ‘cold’ boot-up, when the system starts the normal way. Most system services initialize properly and X attempts to load; sometimes a cursor appears, but then X closes (crashes?) and another launch is attempted. This repeats in an endless loop, and the system cannot get out of it on its own. Visually, the screen goes completely black (backlight disabled?), then gets a bit brighter (like a regular black background with the backlight on), then goes completely black again, and so on. There’s a possible workaround: close the laptop lid (which suspends the laptop), wait until it’s suspended, then open the lid (resume). After that there’s some chance (let’s say 20-30%) that X will start properly.
  • After I’m able to log into my user session (GNOME 3, if it matters), I can suspend the laptop when I want to put it away and resume later. On resume, there’s approximately a 50% chance that X will be restored properly. If it’s not, it looks the same as the previous manifestation: X attempts to start, fails, and attempts again. However, this time I’m able to ‘restore’ it not only by suspending and resuming the laptop, but also by switching TTYs back and forth, with each such switch having approx. a 50% chance of getting X to start properly. So after a resume it’s much easier to get X working again.
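The TTY switching from the second case could also be scripted, for example from an SSH session, instead of using the keyboard. Below is a rough sketch only, assuming `chvt` (from the kbd package) is installed and the script runs as root. The function names and the VT numbers are hypothetical; which VT X actually runs on depends on the display manager setup.

```python
import subprocess
import time


def vt_switch_commands(x_vt, spare_vt):
    """Build the two chvt calls: away to a spare VT, then back to the X VT.

    Both VT numbers are assumptions supplied by the caller.
    """
    return [["chvt", str(spare_vt)], ["chvt", str(x_vt)]]


def switch_away_and_back(x_vt, spare_vt, pause=2.0):
    """Run the switch; needs root and a real console, so it is not called here."""
    for cmd in vt_switch_commands(x_vt, spare_vt):
        subprocess.run(cmd, check=True)  # raises if chvt fails
        time.sleep(pause)                # give X a moment to re-enter the VT
```

Calling `switch_away_and_back(7, 1)` would correspond to one switch to tty1 and back to tty7. Since each switch only seems to succeed about half of the time, it may need to be repeated.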

syslog1: https://www.dropbox.com/s/skc4va2xpt5e91o/syslog.log?dl=0
syslog2: https://www.dropbox.com/s/vtuohhh0ewdw6eo/syslog2.log?dl=0 (before 21:30:14: cold boot with multiple attempts to start X, then suspend; after 21:34:12: successful resume on the first attempt)
nvidia bug report log: https://www.dropbox.com/s/cupbnlwsdjcca55/nvidia-bug-report.log.gz?dl=0
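
To make syslog2 easier to read, it can be split at the two timestamps noted above. This is only a sketch: it assumes the traditional syslog line layout ("Mon DD HH:MM:SS host ..."), assumes the log does not cross midnight, and the function and phase names are made up for illustration.

```python
import re
from datetime import time

# Assumed traditional syslog prefix, e.g. "Jan  1 21:29:59 host ..."
TS = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+(\d{2}):(\d{2}):(\d{2})\s")

COLD_BOOT_END = time(21, 30, 14)  # before this: cold boot and failed X starts
RESUME_START = time(21, 34, 12)   # after this: successful resume


def split_phases(lines):
    """Group log lines into three phases by their time of day."""
    phases = {"cold_boot": [], "between": [], "after_resume": []}
    current = "cold_boot"
    for line in lines:
        match = TS.match(line)
        if match:
            stamp = time(*map(int, match.groups()))
            if stamp < COLD_BOOT_END:
                current = "cold_boot"
            elif stamp > RESUME_START:
                current = "after_resume"
            else:
                current = "between"
        # Lines without a timestamp stay with the preceding line's phase.
        phases[current].append(line)
    return phases
```

It can be fed an open file directly, e.g. `split_phases(open("syslog2.log"))`, after which the `cold_boot` and `after_resume` parts can be compared side by side.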

I included a full syslog in case there are non-NVIDIA lines that matter. It covers a full ‘cold’ system boot-up, several unsuccessful X start attempts, a suspend, and a resume. After the first resume, X started as expected and I was able to log into my GUI user session. If there’s anything else I can add that would help debug this issue, please tell me.

Previously (a year ago, when this bug started manifesting), the log entries were different, and I used them to find the linked thread. It was speculated there that the issue has something to do with the display power state at the moment X is launched, but I could not verify this claim, as I have no control over display power.