Nvidia Drivers and CPU Stuck

I just bought an NVIDIA Tesla K80 for my ProLiant ML310e Gen8 v2 server, but I can’t get the drivers installed properly.
I am running a headless Ubuntu 20.04 Server, and I have installed and purged the NVIDIA drivers several times to see whether any of them works for me.

I tried nvidia-driver-470 and nvidia-headless-470, and neither works.

When I try to run nvidia-smi I always get:

Message from syslogd@datacenter at Sep 12 13:49:02 …
kernel: [47787.904399] watchdog: BUG: soft lockup - CPU 0 stuck for 22s! [nvidia-smi:2563]

And the main console shows:

[163.011774] cloud-init[1090]: 2022-09-12 00:35:17,115 - cc_final_message.py[WARNING]: Used fallback datasource
[47732.675423] genirq: Flags mismatch irq 1. 00000080 (nvidia) vs. 00000000 (eno2-tx-0)

Do you know whether this is a configuration issue, or something else that I am not taking into account?

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Hi @generix, thanks for helping

nvidia-bug-report.log.gz (49.8 KB)

Just for clarification, are you using nvidia-headless-470 or nvidia-headless-470-server?
If you haven’t already, I would suggest trying the -server version.

I’m using nvidia-headless-470. Perfect, let me purge all the NVIDIA packages and try the -server version.

[    0.170820] pci 0000:06:00.0: can't claim BAR 1 [mem 0xf5ff000c00000000-0xf5ff000fffffffff 64bit pref]: no compatible bridge window
[    0.170820] pci 0000:07:00.0: can't claim BAR 1 [mem 0xf8bf000c00000000-0xf8bf000fffffffff 64bit pref]: no compatible bridge window

Please enable Above 4G Decoding/64-bit BARs and disable CSM in the BIOS, then reinstall the OS in EFI mode.
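
For anyone wondering why those BIOS settings matter, here is a rough Python sketch that does the arithmetic on the two `can't claim BAR 1` lines above. It only parses the quoted text; the function names (`unclaimed_bars`, `booted_in_efi_mode`) are made up for illustration and are not part of any NVIDIA tool. The second helper relies on `/sys/firmware/efi` existing only when Linux was booted via UEFI, which can be used to confirm the boot mode after reinstalling.

```python
import os
import re

# The two dmesg lines quoted in this thread.
DMESG = """\
[    0.170820] pci 0000:06:00.0: can't claim BAR 1 [mem 0xf5ff000c00000000-0xf5ff000fffffffff 64bit pref]: no compatible bridge window
[    0.170820] pci 0000:07:00.0: can't claim BAR 1 [mem 0xf8bf000c00000000-0xf8bf000fffffffff 64bit pref]: no compatible bridge window
"""

BAR_RE = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): can't claim BAR (?P<bar>\d+) "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def unclaimed_bars(log):
    """Return (device, bar, start_address, size_in_bytes) per unclaimed BAR."""
    found = []
    for m in BAR_RE.finditer(log):
        start = int(m["start"], 16)
        end = int(m["end"], 16)
        # The range in the log is inclusive, hence the +1.
        found.append((m["dev"], int(m["bar"]), start, end - start + 1))
    return found


def booted_in_efi_mode():
    """True if the running Linux system was booted via UEFI."""
    return os.path.isdir("/sys/firmware/efi")


if __name__ == "__main__":
    for dev, bar, start, size in unclaimed_bars(DMESG):
        gib = size / 2**30
        print(f"{dev} BAR {bar}: {gib:.0f} GiB, above 4 GiB: {start >= 2**32}")
    print("EFI boot:", booted_in_efi_mode())
```

Each of the two GPUs on the K80 is asking for a 16 GiB 64-bit prefetchable window, which cannot fit below the 4 GiB boundary, so the firmware has to be able to map BARs above 4 GiB for the driver to work.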