RTX 4080 falls off the bus once after every cold start

I’m seeing strange behavior on my new system:

  • AMD Ryzen 9 7950X 16-Core
  • Gigabyte X670E Aorus Master
  • DDR5 Corsair Vengeance 5200 MHz 16 GB
  • PNY Nvidia GeForce RTX 4080
  • PSU Corsair 850W

The PC is two months old.
I have a dual boot with Ubuntu 23.04 and Windows 11.
In both OSes I have serious issues the first time the system boots after being powered off.

The PC is powered off. I turn it on and boot either Linux or Windows. In Linux, 1-3 minutes after startup the whole GUI freezes permanently, while the PC is still reachable via SSH. dmesg says:

GPU has fallen off the bus

The only way out is the reset button (a reboot issued over SSH is never executed).
After that reset, I can work the whole day without any problem, including rendering and other applications that use the GPU heavily. But every time I power the PC off and boot it again, this error comes back. Always.
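
Since SSH still works during the freeze, the kernel log can be saved (for example with `dmesg > dmesg.txt`) and filtered for the GPU-related lines. Below is a minimal Python sketch of such a filter. The function name `gpu_error_lines` and the sample log are made up for illustration. Only the "GPU has fallen off the bus" text comes from my actual dmesg. Matching on "NVRM" and "Xid" assumes the usual prefixes of the proprietary NVIDIA kernel module.

```python
import re

# "fallen off the bus" is the text seen in dmesg; "NVRM" and "Xid" are
# assumed prefixes of NVIDIA kernel-module messages (adjust for other drivers).
GPU_ERROR = re.compile(r"fallen off the bus|NVRM|Xid")


def gpu_error_lines(log_text):
    """Return the lines of a dmesg dump that look GPU-related."""
    return [line for line in log_text.splitlines() if GPU_ERROR.search(line)]


# Invented sample: timestamps and surrounding lines are placeholders,
# only the GPU message text is taken from the real log.
sample = """\
[   12.345678] placeholder line about some other device
[   95.000001] GPU has fallen off the bus
[   95.000002] another unrelated placeholder line
"""

for line in gpu_error_lines(sample):
    print(line)
```

Running it on a dump from a cold boot and on one from a warm reboot should make it easy to compare what the kernel logs right before the freeze.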

I tried all the available drivers in Ubuntu (I can’t insert two images…). Anyway:

  • nvidia-driver-525
  • nvidia-driver-525-open
  • nvidia-driver-525-server
  • xserver-xorg-video-nouveau

as well as the latest ones downloaded from the NVIDIA website. The behavior does not change.
I also tried both Wayland and Xorg.

In Windows, the issue is slightly different, but I guess it has the same root cause.
When it happens, the screens go black and after a few minutes the system reboots itself. I checked the Event Viewer, but the only critical error I found was the notification that the system had restarted unexpectedly.

Here, the issue appears within a few minutes of a cold start, but after subsequent reboots it still freezes, only after a longer time (half an hour, or even 1-2 hours).

In both cases it happens even when the system just sits at the desktop with no applications running and CPU and GPU load close to 0%. To me, this rules out a PSU problem.

Here is what I have tried so far:

  • moved the RAM module into the other slot (i.e. from A2 to B2)
  • ran memtest86+ several times (10+ passes)
  • tried all the available drivers in Ubuntu
  • tried to switch between xorg and wayland
  • updated the system (APT in Linux or Windows Update)
  • reinstalled Ubuntu from scratch
  • reinstalled Windows from scratch
  • updated BIOS to the latest version
  • added the pcie_aspm=off kernel option
  • removed the card from the slot and reseated it
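
For the pcie_aspm=off item, it is worth confirming that the option actually reached the running kernel. A minimal Python sketch follows, assuming a Linux system where the boot parameters are exposed in /proc/cmdline. The helper names and the sample command line are hypothetical and only illustrate the check.

```python
def has_kernel_option(cmdline, option):
    """True if `option` appears as a whole word in the kernel command line."""
    return option in cmdline.split()


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel (Linux only)."""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()


# Made-up command line, used here so the example also runs off the target PC.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/nvme0n1p2 ro quiet splash pcie_aspm=off"
print(has_kernel_option(sample, "pcie_aspm=off"))
```

On the affected machine, `has_kernel_option(running_cmdline(), "pcie_aspm=off")` performs the same check against the live system.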

Could you please help me fix this problem? The system is unusable like this (especially in Windows).