Nvidia 331.38: frequent freezes on Ubuntu 13.10, GTX 780M, kernel 3.11

Hi,

I am experiencing frequent system freezes.

I have kernel 3.11 and driver version 331.38.

Every time, I have to do a hard reboot by cutting the power. I ran RAM tests and they reported no errors.

Can you please help me debug this issue?

Where does the Nvidia driver store logs that I can share with you for debugging?

I am uploading syslog, kernel.log, the X logs, and the nvidia bug report logs for now.

Thanks in advance.

nvidia-bug-report.log.gz (202 KB)
nvidia_logs.tar.gz (801 KB)

Same trouble here.
When I play games in Wine, or native Linux games, the system freezes.
kernel.log shows these errors:

[551436.958423] NVRM: GPU at 0000:01:00: GPU-d26810d6-e102-b82c-b81a-b8c381aa8ec4
[551436.958443] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551439.951849] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551446.972599] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551452.973070] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551453.229500] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 0000080c 00050004 0000000c
[551453.229791] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551453.492553] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551453.752141] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551460.476867] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 00002380 00050004 0000000c
[551460.477195] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551461.248422] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551461.248837] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551461.506485] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551461.506944] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551467.990072] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551468.510649] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551468.763458] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[551468.763759] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 0000080c 00050004 0000000c
[551469.022459] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552304.124606] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552304.125031] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552305.951771] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552305.952116] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552306.150958] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552306.151537] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 0000033c 04ab04aa 0000000c
[552315.093972] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552315.094419] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 0000035c 03580356 0000000c
[552315.882304] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
[552315.882743] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 00000124 012a0127 0000000c
[552315.882962] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
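
For anyone comparing logs, here is a minimal Python sketch for tallying which Xid codes show up and how often. The regular expression is only inferred from the NVRM lines above, and `count_xids` is a made-up helper name, not part of any Nvidia tool. The two sample lines are copied from the log in this post.

```python
import re
from collections import Counter

# Pattern inferred from lines such as:
#   NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")

def count_xids(lines):
    """Return a Counter mapping (pci_address, xid_code) to occurrences."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts

# Two lines taken from the kernel log above.
sample = [
    "[551436.958443] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 00040000",
    "[551453.229500] NVRM: Xid (0000:01:00): 13, 0003 00000000 00009097 0000080c 00050004 0000000c",
]

for (pci, code), n in sorted(count_xids(sample).items()):
    print(f"{pci}  Xid {code}: {n}")
```

To run it on a real log, pass an open file object (for example the kernel log file) to `count_xids` instead of `sample`.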

Kernel:
3.12.6-2 x86_64

System:
Debian Jessie

Nvidia drivers:
Any; currently installed: 331.20

Video card:
01:00.0 VGA compatible controller: NVIDIA Corporation GF116 [GeForce GTX 550 Ti] (rev a1)

Xorg:
X.Org X Server 1.14.5
Release Date: 2013-12-12

I have had this problem for over a year…
Sorry for my English.

EDIT: my problems went away (knock on wood) after I set PowerMizer to max-performance mode in NVIDIA X Server Settings.
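
The same setting can probably be applied from a script instead of the GUI. This is a hedged Python sketch: it assumes the `GPUPowerMizerMode` attribute of `nvidia-settings` is available in your driver version and that the value 1 means "Prefer Maximum Performance". The function names are made up for illustration.

```python
import subprocess

# Assumption: 1 selects "Prefer Maximum Performance" for GPUPowerMizerMode.
PREFER_MAX_PERFORMANCE = 1

def powermizer_command(gpu_index=0, mode=PREFER_MAX_PERFORMANCE):
    """Build the nvidia-settings command line that assigns the PowerMizer mode."""
    return ["nvidia-settings", "-a", f"[gpu:{gpu_index}]/GPUPowerMizerMode={mode}"]

def set_powermizer(gpu_index=0, mode=PREFER_MAX_PERFORMANCE):
    """Run the command; this needs a running X session with the NVIDIA driver."""
    return subprocess.run(powermizer_command(gpu_index, mode), check=True)

# Show the command without running it.
print(" ".join(powermizer_command()))
```

Calling `set_powermizer()` inside the X session would apply it. As far as I know the setting does not persist across X restarts, so it would need to run at login.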

Detailed logs with the problems I had: (EE) [mi] EQ overflowing. Additional events will be discarded until existing ev - Pastebin.com

Original post: http://pastebin.com/raw.php?i=gj6Nu9xJ

nvidia-bug-report.log.gz (92.1 KB)

It seems I have solved this problem.
First I enabled IOMMU in the BIOS, then I edited the GRUB config (/etc/default/grub):

GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset noresume iommu=noaperture clocksource=hpet"

Then run sudo update-grub.
Yesterday I played Starcraft II all day and everything was OK; I did not get any errors in kern.log or Xorg.0.log.
Today I ran Heaven Benchmark for over an hour and did not get a freeze.
Driver version: 304.119
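
After update-grub and a reboot, it is worth confirming that the parameters actually reached the running kernel. A small Python sketch, assuming the kernel exposes its boot parameters in /proc/cmdline as usual on Linux; `missing_kernel_params` and `check_running_kernel` are made-up helper names.

```python
# Parameters from the GRUB_CMDLINE_LINUX_DEFAULT line above ("quiet" left out).
WANTED = ["nomodeset", "noresume", "iommu=noaperture", "clocksource=hpet"]

def missing_kernel_params(cmdline, wanted):
    """Return the entries of `wanted` that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]

def check_running_kernel(path="/proc/cmdline"):
    """Check the command line of the currently running kernel."""
    with open(path) as f:
        return missing_kernel_params(f.read(), WANTED)

# Example with a made-up command line that carries only one of the parameters.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet nomodeset"
print(missing_kernel_params(example, WANTED))
```

An empty list from `check_running_kernel()` would mean all four parameters are active.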

UPDATE:
I just got this ***** error.

It happened when I launched Heaven Benchmark.

kern.log

Feb  4 13:25:26 izero kernel: [82792.149604] NVRM: GPU at 0000:01:00: GPU-d26810d6-e102-b82c-b81a-b8c381aa8ec4
Feb  4 13:25:26 izero kernel: [82792.149624] NVRM: Xid (0000:01:00): 8, Channel 00000003

Xorg.0.log

(EE) Backtrace:
(EE) 0: /usr/bin/X (xorg_backtrace+0x3d) [0x7f2bcff89ccd]
(EE) 1: /usr/bin/X (QueuePointerEvents+0x52) [0x7f2bcfe53802]
(EE) 2: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f2bc8028000+0x571d) [0x7f2bc802d71d]
(EE) 3: /usr/bin/X (0x7f2bcfde8000+0x91ab8) [0x7f2bcfe79ab8]
(EE) 4: /usr/bin/X (0x7f2bcfde8000+0xb9fb0) [0x7f2bcfea1fb0]
(EE) 5: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f2bceec8000+0xf210) [0x7f2bceed7210]
(EE) 6: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0x9743d) [0x7f2bc8baf43d]
(EE) 7: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0x98394) [0x7f2bc8bb0394]
(EE) 8: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0x11c90a) [0x7f2bc8c3490a]
(EE) 9: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0xd3722) [0x7f2bc8beb722]
(EE) 10: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0x5f917c) [0x7f2bc911117c]
(EE) 11: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f2bc8b18000+0x5c68a9) [0x7f2bc90de8a9]
(EE) 12: /usr/bin/X (BlockHandler+0x44) [0x7f2bcfe41034]
(EE) 13: /usr/bin/X (WaitForSomething+0x124) [0x7f2bcff871f4]
(EE) 14: /usr/bin/X (0x7f2bcfde8000+0x54d61) [0x7f2bcfe3cd61]
(EE) 15: /usr/bin/X (0x7f2bcfde8000+0x4455a) [0x7f2bcfe2c55a]
(EE) 16: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf5) [0x7f2bcdb09995]
(EE) 17: /usr/bin/X (0x7f2bcfde8000+0x4489f) [0x7f2bcfe2c89f]
(EE) 
[ 82651.813] (WW) NVIDIA(0): WAIT (1, 6, 0x8000, 0x0000ad98, 0x0000b838)
[ 82651.813] [mi] Increasing EQ size to 2048 to prevent dropped events.
[ 82651.814] [mi] EQ processing has resumed after 1501 dropped events.
[ 82651.814] [mi] This may be caused my a misbehaving driver monopolizing the server's resources.

I don't know why Nvidia hates me :)
nvidia-bug-report.log.gz (87.5 KB)

I disabled the Composite extension, and now everything is OK.
Please try it and let me know.

Section "Extensions"
    Option         "Composite" "Disable"
EndSection