Xorg unable to recognize/boot with Nvidia Driver

Hi all,

I have searched all over the internet for solutions to the problems I have been having but have had no luck, which is why I am creating this topic. I hope this is the correct forum.

I have a server with 4x NVIDIA GeForce RTX 2080 Ti cards running CentOS 7 with the GNOME desktop environment. After installing any NVIDIA driver (I have tried both the ELRepo package and the installer from the NVIDIA website), the computer freezes either at boot or after login. Usually it is the former: the screen shows the GDM service starting and then the machine hangs.

First, when I run startx, I receive an error that serverauth.XXXXX does not exist and that xinit cannot connect to the X server. Second, my Xorg.0.log file reports the error “(EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:104:0:0.” and, at the end of the file, “(EE) Screen(s) found, but none have a usable configuration. [ 25.346] (EE) Fatal server error: [ 25.346] (EE) no screens found(EE)”.

Could someone please help me with this issue? It seems like the xorg.conf file may need to be edited so that it recognizes the GPUs.

Attached are my lshw output, NVIDIA bug report, and Xorg.0.log for reference. (NOTE: when I last ran nvidia-smi it showed only 3 GPUs because one was accidentally disconnected at the time; the fourth card is installed.)

Thanks in advance!

Xorg.0.log (10.6 KB) nvidia-bug-report.log.gz (2.1 MB) lshw.txt (2.0 KB)

Just posting this reply for anyone else who encounters this problem: it turns out one of the GPUs had not been assigned a UUID (the topmost card, at PCI:104:0:0 in Xorg's decimal notation, which is bus 68 in hex, i.e. PCI:68:0:0), so we are opening an RMA with the company we ordered it from. A simple way to check for this is to look at the card's information file, e.g. vi /proc/driver/nvidia/gpus/YOUR PCI HERE/information. Taking out the defective card and substituting another one fixed the problem.
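
For anyone who would rather not open each file by hand, here is a rough Python sketch that walks every GPU directory under /proc/driver/nvidia/gpus and flags cards without a proper UUID. It rests on a few assumptions that are not from my original troubleshooting: that the information file uses "Key: value" lines with a "GPU UUID" field, that a healthy UUID looks like GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx in hex, and that the directory name is the PCI address in hex (e.g. 0000:68:00.0). The function names are just ones I made up for illustration.

```python
import glob
import os
import re

# Assumed location of the per-GPU files exposed by the NVIDIA kernel module.
PROC_GPU_GLOB = "/proc/driver/nvidia/gpus/*/information"

# Assumed shape of a healthy UUID: GPU- followed by 8-4-4-4-12 hex digits.
UUID_PATTERN = re.compile(
    r"GPU-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def parse_information(text):
    """Parse 'Key: value' lines into a dict (splits on the first colon only)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def has_valid_uuid(fields):
    """Return True if the 'GPU UUID' field matches the assumed UUID shape."""
    return bool(UUID_PATTERN.fullmatch(fields.get("GPU UUID", "")))


def xorg_bus_id(pci_address):
    """Convert a hex PCI address like '0000:68:00.0' to Xorg's decimal
    'PCI:104:0:0' form. The PCI domain is ignored in this sketch."""
    _domain, bus, devfn = pci_address.split(":")
    device, function = devfn.split(".")
    return "PCI:{}:{}:{}".format(int(bus, 16), int(device, 16), int(function, 16))


def main():
    for path in sorted(glob.glob(PROC_GPU_GLOB)):
        # The parent directory name is assumed to be the PCI address.
        address = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            fields = parse_information(handle.read())
        status = "ok" if has_valid_uuid(fields) else "NO VALID UUID"
        print(address, xorg_bus_id(address), fields.get("Model", "?"), status)


if __name__ == "__main__":
    main()
```

Any card reported as NO VALID UUID is a candidate for the kind of hardware fault described above. The Xorg-style bus ID printed next to it can be matched against the "Failed to initialize the NVIDIA GPU at PCI:..." line in Xorg.0.log.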