problems with recent drivers on old hardware

I have already discussed this with the help desk. The only solution that we found was to use old drivers. I want to post it here so others having the same problem can reply.

I am using 8400GS hardware and I have been experiencing short system hangs with the newer drivers. 295.71 is the latest driver that doesn’t do it at all. That is all that I want to say about it now. I will give more information later.

NOTE ADDED 7/28/2013: new versions of various software have made installing the old driver (295.71) no longer possible. the problem persists and seems to simply be that the new drivers expect too much from the old graphics hardware. please suggest a distro that will allow me to install a driver that is compatible with my hardware.

ANOTHER NOTE 7/31/2013: I managed to get 295.71 installed on an older kernel by copying the 32-bit libraries into the application directory. Performance is the same (not good when adapter memory is depleted), but there are no hangs.

it just now happened with 295.71. aiiiieeeeeeeeeee

I don’t know if my problem is the same, but with anything 300 series (302, 304, up to 310.14) I have short hangs exactly every 5 seconds.

However, my hardware is a GTX 570. Not the latest and greatest, but not “old hardware” yet…

So if it’s not hardware, maybe it’s something in our systems? Older kernel? Older glibc? Other library? It’s hard to tell without installing different distros to check…

I’m running 295.75 now, used to run 295.71 like you before. Anything 3xx is broken.

I’ve also sent a bug report to linux-bugs@nvidia.com, even got a reply but only asking for more details about my system. No resolution so far.

SteveBean, are you also seeing the issue with all 3xx.xx drivers? I understand 295.75 is working for Lamieur; is it working for you as well?

SteveBean, Lamieur, issues of this kind are a little tricky to reproduce.

Please provide as much information as you can to help reproduce the issue:

  • Bug report file generated by running the nvidia-bug-report.sh script as root
  • Reproduction steps in detail
  • Desktop environment you are using (KDE, GNOME, Unity, etc.)
  • Window manager you are using (Compiz, gnome-shell, KWin, Unity, etc.)
  • Whether the issue is specific to a GPU, system hardware, OS, software component, etc.
  • Any display manager in use (gdm, kdm, etc.)

yes, i have the problem with all 3xx.xx drivers. it hadn’t happened with 295.75 for a few days, then it did. i changed to 295.71 and it hasn’t happened since.

the subject line of my email conversation with Nvidia Customer Care is:
Linux issue [Incident: 120920-000205]

I had issues with the 3xx series as well. I was testing out CUDA 5, but one machine with some M2090s would intermittently have troubles where the system log would report that the GPUs had “fallen off the PCIe bus”. After that, the kernel would start reporting errors about a hung CPU, which presumably corresponded to the kernel thread that had been executing the nv driver. I would have to reboot the node to recover. I rolled back to 295.59 and the problem went away. It is difficult to reproduce though - it happened 3 times over the course of a day, but has never happened with the 295 series.

my typical error messages are like these:

[416049.552468] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416051.552492] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416080.827016] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416082.827032] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416285.067127] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416287.067140] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416489.662749] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
[416491.662773] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

they always happen in pairs, 2 seconds apart. some drivers will hang the system hard enough that the networking will be interrupted and there will be eth0 down/up messages after each pair.
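
For anyone who wants to check whether their own kernel log shows the same "pairs, 2 seconds apart" pattern, here is a minimal Python sketch. It only assumes the message format quoted above; the function names and the 3-second grouping threshold are my own choices for illustration, not anything taken from the driver.

```python
import re

# Matches kernel log lines such as:
# [416049.552468] NVRM: os_schedule: Attempted to yield the CPU while in ...
PATTERN = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+NVRM: os_schedule: Attempted to yield the CPU"
)


def os_schedule_timestamps(lines):
    """Return the kernel timestamps (in seconds) of every os_schedule warning."""
    stamps = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def group_bursts(stamps, max_gap=3.0):
    """Group timestamps into bursts.

    A new burst starts whenever the gap to the previous message exceeds
    max_gap seconds (3.0 is an arbitrary threshold chosen for this sketch).
    """
    bursts = []
    for stamp in stamps:
        if bursts and stamp - bursts[-1][-1] <= max_gap:
            bursts[-1].append(stamp)
        else:
            bursts.append([stamp])
    return bursts


# Two pairs copied from the log excerpt in this thread.
SAMPLE = [
    "[416049.552468] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
    "[416051.552492] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
    "[416080.827016] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
    "[416082.827032] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
]

for burst in group_bursts(os_schedule_timestamps(SAMPLE)):
    span = burst[-1] - burst[0]
    print(f"burst at {burst[0]:.6f}: {len(burst)} message(s), span {span:.3f}s")
```

To run it on a real log, pass the lines of saved dmesg output to os_schedule_timestamps instead of SAMPLE. It only describes the timing of the messages; it says nothing about what causes them.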

it happens with 304.64 too.
it still hasn’t happened with 295.71.

I’ve had the same thing happen when using Adobe Flash and 3D heavy applications. This showed up in the Xorg log at the same time.

(EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/X (xorg_backtrace+0x34) [0x5965a4]
(EE) 1: /usr/bin/X (mieqEnqueue+0x263) [0x5772a3]
(EE) 2: /usr/bin/X (0x400000+0x4fc84) [0x44fc84]
(EE) 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fef3e351000+0x6208) [0x7fef3e357208]
(EE) 4: /usr/bin/X (0x400000+0x7a297) [0x47a297]
(EE) 5: /usr/bin/X (0x400000+0xa52c7) [0x4a52c7]
(EE) 6: /lib64/libpthread.so.0 (0x7fef44977000+0x10b80) [0x7fef44987b80]
(EE) 7: /usr/bin/X (0x400000+0x19a880) [0x59a880]
(EE) 8: /lib64/libpthread.so.0 (0x7fef44977000+0x10b80) [0x7fef44987b80]
(EE) 9: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fef3ef79000+0x584c66) [0x7fef3f4fdc66]
(EE) 10: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fef3ef79000+0x588327) [0x7fef3f501327]
(EE) 11: /usr/bin/X (0x400000+0x1165c1) [0x5165c1]
(EE) 12: /usr/bin/X (0x400000+0x3b031) [0x43b031]
(EE) 13: /usr/bin/X (0x400000+0x29b1a) [0x429b1a]
(EE) 14: /lib64/libc.so.6 (__libc_start_main+0xed) [0x7fef4361191d]
(EE) 15: /usr/bin/X (0x400000+0x29e71) [0x429e71]
(EE) 
(EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
(EE) [mi] mieq is *NOT* the cause.  It is a victim.
[134587.802] [mi] Increasing EQ size to 512 to prevent dropped events.
[134587.802] [mi] EQ processing has resumed after 73 dropped events.
[134587.802] [mi] This may be caused my a misbehaving driver monopolizing the server's resources.

EDIT: Kernel 3.5.7-gentoo, driver 304.64 (64 bit), Xorg 1.13.0 (7.4), GeForce GTX 260M

Rolled back to 304.51, doesn’t happen with that driver version.
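
Since the log itself says the backtrace "may point to a culprit higher up the stack", it can help to count which modules appear in it. The following is a rough Python sketch that assumes the frame format shown in the backtrace above; the regular expression and the function name are illustrative assumptions, and other Xorg versions may format frames differently.

```python
import re
from collections import Counter

# Matches backtrace frames such as:
# (EE) 9: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fef3ef79000+0x584c66) [0x7fef3f4fdc66]
# An optional leading "[timestamp]" is tolerated as well.
FRAME = re.compile(r"^(?:\[\s*\d+\.\d+\]\s+)?\(EE\)\s+(\d+):\s+(\S+)\s+\(")


def frames_by_module(log_lines):
    """Count how many backtrace frames belong to each binary or module."""
    counts = Counter()
    for line in log_lines:
        match = FRAME.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


# A few frames copied from the backtrace in this thread.
SAMPLE = [
    "(EE) 0: /usr/bin/X (xorg_backtrace+0x34) [0x5965a4]",
    "(EE) 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fef3e351000+0x6208) [0x7fef3e357208]",
    "(EE) 9: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fef3ef79000+0x584c66) [0x7fef3f4fdc66]",
    "(EE) 10: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fef3ef79000+0x588327) [0x7fef3f501327]",
]

for module, count in frames_by_module(SAMPLE).most_common():
    print(f"{count:3d}  {module}")
```

Frames inside nvidia_drv.so only show that the driver was on the stack when the event queue overflowed; they are a hint worth including in a bug report, not proof of the cause.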

310.19 does it too.

read this (https://devtalk.nvidia.com/default/topic/522835/linux/if-you-have-a-problem-please-read-this-first/) and attach the log, renaming it with a .jpg extension.

the people in this thread:

https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers/+bug/973096

are having similar problems.

313.09 does it too

it was exactly as i suspected, an upgrade to a newer kernel fixed it.
NOTE ADDED 7/27/2013: that version, oneiric, is no longer supported. i had to switch to precise or newer and the problem is back.

I am having very frustrating problems with 3XX drivers as well. I still have multiple machines using the 295.20 driver, and these are very stable. Unfortunately, I am having trouble installing the older drivers (295.XX) on any kernel after 2.6. Specifically, the installer script complains about missing the /linux/version.h, and when I provide this header after some fiddling, the script complains about some kernel conflict. Can anyone point me in the right direction? Searching this forum for 295 yields 100+ pages of results… All I want to do is install old drivers (295.XX) on either Mint 15 or openSUSE 12.3, with 3.7 kernel or newer. Architecture is x86_64. Thanks!
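
On the missing linux/version.h: as far as I know, starting with kernel 3.7 the generated header lives under include/generated/uapi/linux/version.h in the kernel build tree instead of include/linux/version.h, which would explain why an installer written for older kernels cannot find it. A small Python sketch to see which location your kernel headers use; the /lib/modules/<release>/build path is the usual convention and is an assumption here, as are the function names.

```python
import os
import platform

# Locations of the generated version.h, relative to the kernel build directory.
# The first is the pre-3.7 layout, the second the layout used since 3.7.
CANDIDATES = (
    os.path.join("include", "linux", "version.h"),
    os.path.join("include", "generated", "uapi", "linux", "version.h"),
)


def find_version_header(build_dir):
    """Return the first existing version.h under build_dir, or None."""
    for relative in CANDIDATES:
        path = os.path.join(build_dir, relative)
        if os.path.isfile(path):
            return path
    return None


def default_build_dir():
    """Conventional build directory for the running kernel (an assumption)."""
    return os.path.join("/lib/modules", platform.release(), "build")


build_dir = default_build_dir()
print("build dir: ", build_dir)
print("version.h: ", find_version_header(build_dir))
```

This only locates the header. Whether a 295.XX installer will then build against a 3.7 or newer kernel is a separate question, and the "kernel conflict" error described above suggests it may not without further changes.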

i am currently getting the system hangs and the NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context messages in pairs. i am using ubuntu precise with the 3.5 kernel from raring and the 304.88 driver with my 8400GS card in my dual core 8GB 64bit system. i have noticed processes with RT priority, could these be interrupting the atomic or interrupt context routines in the driver?
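
To see exactly which processes have RT priority, here is a rough Python sketch that reads /proc/<pid>/stat, where field 40 is rt_priority and field 41 is the scheduling policy according to the proc(5) man page. The function names are made up for this example. Note that some kernel threads (migration, watchdog) normally run with a real-time policy, so seeing them in the list is expected.

```python
import os

# Scheduling policy numbers as defined by Linux; 1 and 2 are the real-time ones.
SCHED_NAMES = {0: "SCHED_OTHER", 1: "SCHED_FIFO", 2: "SCHED_RR",
               3: "SCHED_BATCH", 5: "SCHED_IDLE", 6: "SCHED_DEADLINE"}
REALTIME_POLICIES = {1, 2}


def parse_stat(text):
    """Return (pid, comm, rt_priority, policy) from /proc/<pid>/stat contents."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ")" rather than on whitespace.
    head, _, tail = text.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    # tail starts at field 3 (state), so field N sits at index N - 3.
    rt_priority = int(fields[40 - 3])
    policy = int(fields[41 - 3])
    return int(pid_text), comm, rt_priority, policy


def realtime_processes(proc_root="/proc"):
    """List (pid, comm, rt_priority, policy name) for SCHED_FIFO/SCHED_RR tasks."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux system, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, rt_priority, policy = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or unexpected format
        if policy in REALTIME_POLICIES:
            found.append((pid, comm, rt_priority, SCHED_NAMES[policy]))
    return sorted(found)


for pid, comm, rt_priority, policy_name in realtime_processes():
    print(f"{pid:>7}  {policy_name:<11} prio {rt_priority:>2}  {comm}")
```

This only lists the tasks; it does not show whether any of them has anything to do with the NVRM message.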

Nvidia! please give us some information about what can cause this message to be emitted. many of us are having this problem.

Yes, please give us some information. Some of us have put a lot of money into these cards and a lot of work into our code. Linux relies on clarity in error reports and resolution of bugs by providers of proprietary drivers - please throw us a bone!

SteveBean, please provide as much information as you can to help reproduce the issue:

  • Bug report file generated by running the nvidia-bug-report.sh script as root
  • Reproduction steps in detail
  • Desktop environment you are using (KDE, GNOME, Unity, etc.)
  • Window manager you are using (Compiz, gnome-shell, KWin, Unity, etc.)
  • Whether the issue is specific to a GPU, system hardware, OS, software component, etc.
  • Any display manager in use (gdm, kdm, etc.)

i gave all that information to nvidia tech support, the subject line of my email conversation with Nvidia Customer Care was Linux issue [Incident: 120920-000205].

i am running 64 bit lubuntu precise with the raring backport kernel (lxde desktop with lightdm display manager and version 3.5 kernel). it is happening very often now, while i am running 2 3d programs and chrome browser playing a flash live video stream. the only system that i run is my ASUS M3A78-EM motherboard with my 2.6 GHz dual core AMD processor and 8GB memory. my graphics card is a PNY brand 8400GS with 512MB total memory (256MB dedicated).

i would be willing to go back to the latest driver that doesn’t hang (295.71) if i could. but if i install it using the nvidia installer my 32 bit 3d programs won’t run. if i install 304.88 with the ubuntu installer they run fine. (if i install 304.88 with the nvidia installer my 32 bit 3d programs won’t run either.)