GPU has fallen off the bus, nvidia-361.28, kernel 4.2.0

Hi everyone!
I’m running a Kubuntu 15.10 installation and trying to use the NVIDIA driver with my GTX 970.
After booting, the graphical system crashes completely after a random amount of time (usually just a few minutes), and afterwards I can’t even switch to a console.
The kernel is still alive, though, since I can reboot the machine with the SysRq keys.
This is the error I could pull from dmesg:

[ 650.223614] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[ 650.223647] NVRM: A GPU crash dump has been created. If possible, please run
[ 650.223647] NVRM: nvidia-bug-report.sh as root to collect this data before
[ 650.223647] NVRM: the NVIDIA kernel module is unloaded.
[ 650.223732] NVRM: GPU at PCI:0000:01:00: GPU-30cf345c-96d2-271f-6b34-092d80a3cae2
[ 650.223751] NVRM: Xid (PCI:0000:01:00): 48, An uncorrectable double bit error (DBE) has been detected on GPU in the L2 cache at cache 1, slice 0.

Sometimes it happens without the bit error too:
[ 50.934212] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[ 50.934222] NVRM: A GPU crash dump has been created. If possible, please run
[ 50.934222] NVRM: nvidia-bug-report.sh as root to collect this data before
[ 50.934222] NVRM: the NVIDIA kernel module is unloaded.

Normally I would suspect a hardware or power supply issue, but some recent games run just fine on the same machine under Windows 10, so I suspect it has something to do with the NVIDIA driver on Linux. The nouveau driver doesn’t crash either, but that one isn’t using any hardware acceleration, right?

Any help or hints are highly appreciated.

Regards
Florian

Hi,

I have seen some other “GPU has fallen off the bus” posts on this forum over the years. Unfortunately, there doesn’t seem to be a straightforward solution for this. Some people reported that reseating the card in the PCIe slot and cleaning the slot helped. However, if that were the cause, nouveau and Windows 10 should have problems too. It doesn’t hurt to try, though.
Is there any kind of predictability in this? Does it happen when you close something? I have seen similar reports related to Team Fortress 2:
https://devtalk.nvidia.com/default/topic/853030/hard-lockups-after-exiting-team-fortress-2-xid-62-/
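
To check for a pattern, it may help to pull just the timestamps and Xid codes out of dmesg and compare them across a few crashes. Below is a rough Python sketch of that idea. The function name summarize_nvrm and the two regular expressions are my own and are based only on the log lines quoted in the first post, so other driver versions may format these messages differently.

```python
import re

# Assumed line formats, taken from the dmesg excerpts in this thread:
#   [ 650.223614] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
#   [ 650.223751] NVRM: Xid (PCI:0000:01:00): 48, An uncorrectable ...
OFF_BUS_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+NVRM: GPU at (\S+) has fallen off the bus"
)
XID_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),"
)


def summarize_nvrm(log_text):
    """Return (seconds_since_boot, kind, detail) tuples in log order.

    kind is "off_bus" (detail = PCI address string) or
    "xid" (detail = Xid code as an int). Hypothetical helper.
    """
    events = []
    for line in log_text.splitlines():
        match = OFF_BUS_RE.search(line)
        if match:
            events.append((float(match.group(1)), "off_bus", match.group(2)))
            continue
        match = XID_RE.search(line)
        if match:
            events.append((float(match.group(1)), "xid", int(match.group(3))))
    return events


if __name__ == "__main__":
    # Two lines copied from the first post, used as sample input.
    sample = (
        "[ 650.223614] NVRM: GPU at 0000:01:00.0 has fallen off the bus.\n"
        "[ 650.223751] NVRM: Xid (PCI:0000:01:00): 48, An uncorrectable "
        "double bit error (DBE) has been detected on GPU in the L2 cache "
        "at cache 1, slice 0.\n"
    )
    for seconds, kind, detail in summarize_nvrm(sample):
        print(seconds, kind, detail)
```

Saving the dmesg output after each crash and running it through something like this would show whether the failures cluster at a similar uptime or always come with the same Xid code.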