With an RTX 6000 installed in a Dell Precision Rack 7920, I get the following error every time I reboot:
“UEFI0077: One or more PCIe device errors occurred in the previous boot.
Check the System Event Log (SEL) to identify the PCIe device with errors, and then update its firmware.”
The SEL contains the same error message (UEFI0077) as well as another entry: “PCI1318 - A fatal error was detected at bus 59 device 0 function 2.”
When the message pops up, the LED on the front left of the Dell switches from blue to orange.
I can press F1 to continue booting, and the graphics card then works properly, but every reboot requires user intervention to complete.
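To look up the device named in the PCI1318 entry with lspci, the location has to be written in the hexadecimal bus:device.function form that lspci prints. A minimal sketch of that conversion, assuming the SEL reports bus, device, and function as decimal numbers (I have not confirmed this against Dell's documentation); the function name sel_to_bdf is made up for illustration:

```python
def sel_to_bdf(bus: int, device: int, function: int, domain: int = 0) -> str:
    """Format a PCI location as the domain:bus:device.function string used by lspci.

    Assumption: the inputs are the decimal numbers shown in the SEL entry.
    """
    # PCI allows 256 buses, 32 devices per bus, and 8 functions per device.
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus/device/function out of range")
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    # Values from the SEL entry: bus 59, device 0, function 2.
    print(sel_to_bdf(59, 0, 2))
```

If the decimal assumption holds, the result can be passed to lspci, for example `lspci -vv -s 0000:3b:00.2`, to see which function of the card is reporting the error.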
Some data points:
Machine is running CentOS 7.4.
If I do a full shutdown instead of a reboot, I don't get the error.
With a different graphics card (P6000 or RTX 8000), I don't get the error.
I've updated the workstation BIOS and iDRAC versions to the latest available.
The same card in a different system model does not cause the error.
The current VBIOS on the RTX 6000 is 90.02.15.00.04. Is there a newer VBIOS that I could try?