Screen/system is dead on resume (unable to resume with all current drivers)

REISUB = “Raising Elephants Is So Utterly Boring”

I tried that one.

Does your system reboot when you press SysRq + B?

If it does, then you should have messages in the system log.

Something like:

SysRq : Emergency Sync
Emergency Sync complete
SysRq : Emergency Remount R/O
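
If it helps, here is a minimal Python sketch for checking a saved log for those messages. The regular expression is modeled only on the sample lines quoted above, and the function name `sysrq_actions` is made up for illustration, so adjust both to your actual log format.

```python
import re

# Matches lines such as "SysRq : Emergency Sync".
# Assumption: the pattern is derived only from the sample lines in this thread.
SYSRQ_RE = re.compile(r"SysRq\s*:\s*(?P<action>.+)$")


def sysrq_actions(log_text):
    """Return the SysRq actions found in log_text, in order of appearance."""
    actions = []
    for line in log_text.splitlines():
        match = SYSRQ_RE.search(line)
        if match:
            actions.append(match.group("action").strip())
    return actions


if __name__ == "__main__":
    sample = (
        "SysRq : Emergency Sync\n"
        "Emergency Sync complete\n"
        "SysRq : Emergency Remount R/O\n"
    )
    # An empty list would mean no SysRq messages reached the log.
    print(sysrq_actions(sample))
```

An empty result on the real log would match the "nothing in the log" symptom, which suggests the key presses were handled but the messages never made it to disk.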

Yes, it reboots after SysRq + B, but there is nothing in the log before that.

Try SSH’ing into your PC and running dmesg; also try restarting the X server (sudo killall -9 X).

↑↑↑

↑↑↑

Aaron, can you please look into this?

Another bump.

Aug  5 02:20:18 localhost kernel: [47802.155504] NVRM: GPU at 0000:01:00: GPU-136382c0-06fa-2c0f-977a-4f04b1755070
Aug  5 02:20:18 localhost kernel: [47802.155508] NVRM: Xid (0000:01:00): 56, CMDre 00000000 00000088 0100cca3 00000007 00000000
Aug  5 02:20:18 localhost kernel: [47802.155519] NVRM: Xid (0000:01:00): 56, CMDre 00000000 0000008c 00000000 00000005 0000102b
Aug  5 02:20:19 localhost kernel: [47803.153910] NVRM: Xid (0000:01:00): 31, Ch 00000000, engmask 00000101, intr 10000000
Aug  5 02:20:21 localhost kernel: [47805.155447] NVRM: Xid (0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00001005

[ 47855.703] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x00008944, 0x0000aa30)
[ 47862.703] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x00008944, 0x0000aa30)
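
For anyone comparing logs across driver versions, here is a rough Python sketch that pulls the Xid code out of each NVRM line and counts how often each code occurs. The line format is assumed from the kernel messages quoted above only. The helper names `parse_xid_lines` and `count_xid_codes` are hypothetical, and the script does not attempt to interpret what any code means.

```python
import re
from collections import Counter

# Assumption: Xid lines look like the ones in this thread, e.g.
#   "NVRM: Xid (0000:01:00): 56, CMDre 00000000 ..."
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+), (?P<detail>.*)$"
)


def parse_xid_lines(log_text):
    """Return a list of (pci_address, xid_code, detail) tuples."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append(
                (
                    match.group("pci"),
                    int(match.group("code")),
                    match.group("detail").strip(),
                )
            )
    return events


def count_xid_codes(log_text):
    """Count how many times each Xid code appears in log_text."""
    return Counter(code for _, code, _ in parse_xid_lines(log_text))


if __name__ == "__main__":
    sample = (
        "Aug  5 02:20:18 localhost kernel: [47802.155508] NVRM: Xid (0000:01:00): 56, "
        "CMDre 00000000 00000088 0100cca3 00000007 00000000\n"
        "Aug  5 02:20:19 localhost kernel: [47803.153910] NVRM: Xid (0000:01:00): 31, "
        "Ch 00000000, engmask 00000101, intr 10000000\n"
    )
    print(count_xid_codes(sample))
```

Feeding it the output of dmesg or a saved kernel log gives a quick per-code tally that is easier to compare between driver releases than the raw lines.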

Bump.

Bump.

Bump.

This is getting weird. Windows 7 SP1 64-bit reports that the NVIDIA driver has crashed on resume.

The description for Event ID 14 from source nvlddmkm cannot
be found. Either the component that raises this event is not installed
on your local computer or the installation is corrupted. You can install
or repair the component on the local computer.

If the event originated on another computer, the display information had
to be saved with the event.

The following information was included with the event: 

\Device\Video7
CMDre 00000000 00000088 ff1fe121 00000007 00000000

Bump.

I am not able to reproduce this issue with a GeForce GTX 660 (GK106) GPU on Ubuntu 13.04 x86_64.

I know, but can you at least tell me what those XID errors mean exactly?

It has something to do with my GPU, but I’ve no idea what can be wrong with it.

Besides, as I already said, Windows 7 and 8/8.1 also have this problem.

birdie, sorry, that is confidential information. Do you think it’s specific to your motherboard or GPU?

I tend to think it’s my GPU, since the previous one works here just fine. However, I’m quite sure I won’t be able to RMA/replace it, since otherwise it works perfectly.

Even though this information is confidential, can you describe in general what this error means without going into specifics? Also you can PM me without making it public.

The newest drivers aren’t any better:

[ 39036.115] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x0000235c, 0x00004448)
[ 39042.116] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x0000235c, 0x00004448)

Oct 14 00:30:46 localhost kernel: [38989.338356] NVRM: GPU at 0000:01:00: GPU-136382c0-06fa-2c0f-977a-4f04b1755070
Oct 14 00:30:46 localhost kernel: [38989.338360] NVRM: Xid (0000:01:00): 56, CMDre 00000000 00000088 0100cb53 00000007 00000000
Oct 14 00:30:46 localhost kernel: [38989.338370] NVRM: Xid (0000:01:00): 56, CMDre 00000000 0000008c 00000000 00000005 0000102b
Oct 14 00:30:49 localhost kernel: [38992.337718] NVRM: Xid (0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00001005
Oct 14 00:30:54 localhost kernel: [38997.335678] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 14 00:30:56 localhost kernel: [38999.333964] NVRM: Xid (0000:01:00): 8, Channel 00000000

birdie, do you think this issue is specific to the ASUSTeK P8P67 PRO motherboard + CentOS release 6.4 + kernel 3.9.4?
Also, try the kernel parameters video=vesa:off vga=0 to resolve the issue below in your log:

Jun 24 11:39:32 localhost kernel: [ 15.591555] NVRM: Your system is not currently configured to drive a VGA console
Jun 24 11:39:32 localhost kernel: [ 15.591558] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
Jun 24 11:39:32 localhost kernel: [ 15.591560] NVRM: requires the use of a text-mode VGA console. Use of other console
Jun 24 11:39:32 localhost kernel: [ 15.591561] NVRM: drivers including, but not limited to, vesafb, may result in
Jun 24 11:39:32 localhost kernel: [ 15.591563] NVRM: corruption and stability problems, and is not supported.
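
To confirm that the suggested parameters actually took effect after a reboot, something like the Python sketch below could be used. It compares whole tokens of a kernel command line against a list of wanted parameters. The helper name `missing_kernel_params` is hypothetical, and the sample command line in the comment is made up for illustration.

```python
def missing_kernel_params(cmdline, wanted):
    """Return the entries of wanted that are absent from cmdline.

    cmdline is the kernel command line as one string; parameters are
    compared as whole whitespace-separated tokens.
    """
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


# Example with a made-up command line (not taken from the reporter's system):
#   missing_kernel_params("ro quiet video=vesa:off", ["video=vesa:off", "vga=0"])
# On a running Linux system the real string can be read from /proc/cmdline:
#   with open("/proc/cmdline") as f:
#       print(missing_kernel_params(f.read(), ["video=vesa:off", "vga=0"]))
```

An empty list means both parameters were passed to the running kernel, so any remaining VGA console warning is not caused by a bootloader typo.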

I’ve tried that - the same result.

BTW, Windows 7/8 also report the same Xid errors when resuming from sleep (and drivers do crash on resume in Windows - but since Windows drivers run in userspace, Windows is able to survive by restarting them), so either my motherboard is incompatible with this GPU (I’m running the newest BIOS) or this particular GPU is buggy. I’ve no idea how to check either of these assumptions since you don’t want to reveal what those Xid errors mean.

BTW, does it make sense to report this issue to ASUS? I suspect they won’t help me, citing your NDA policy in regard to these Xid errors - and there’s no way ASUS can debug my issue anyway, since the video card is not theirs (it’s made by Gigabyte). I’m very sad.

Can you at least say if these errors are GPU memory related?

Unfortunately, there are no utilities at all to check my GPU memory.