I consistently hit an Xid 62 error on Ubuntu 20.04 with an RTX 3090.
Every time I run nvidia-smi (which prints "No devices were found"), an Xid 62 error is logged to dmesg, like this:
[ 260.970401] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
[ 260.970523] caller os_map_kernel_space.part.0+0x73/0x80 [nvidia] mapping multiple BARs
[ 261.825963] NVRM: GPU at PCI:0000:01:00: GPU-64588dc5-df58-792c-c418-b5d69e1102e5
[ 261.825967] NVRM: GPU Board Serial Number:
[ 261.825971] NVRM: Xid (PCI:0000:01:00): 62, pid=1904, 0000(0000) 00000000 00000000
[ 269.925557] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x53:0x65:2109)
[ 269.925661] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
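In case it helps anyone comparing logs, here is a rough Python sketch for pulling the Xid events out of saved dmesg output. It assumes the line format shown above (timestamp, PCI address, Xid number, numeric pid); other driver versions may print the pid field differently, so the pattern may need adjusting. The names `XID_RE` and `find_xid_events` are made up for this example.

```python
import re

# Assumed format, taken from the log above:
# [ 261.825971] NVRM: Xid (PCI:0000:01:00): 62, pid=1904, 0000(0000) 00000000 00000000
XID_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: Xid "
    r"\((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+), pid=(?P<pid>\d+)"
)


def find_xid_events(dmesg_text):
    """Return one dict per NVRM Xid line found in the given dmesg text."""
    events = []
    for line in dmesg_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append({
                "timestamp": float(match.group("ts")),
                "pci": match.group("pci"),
                "xid": int(match.group("xid")),
                "pid": int(match.group("pid")),
            })
    return events


# Two lines copied from the log above; only the second is an Xid event.
sample = (
    "[ 261.825967] NVRM: GPU Board Serial Number:\n"
    "[ 261.825971] NVRM: Xid (PCI:0000:01:00): 62, pid=1904, "
    "0000(0000) 00000000 00000000\n"
)

for event in find_xid_events(sample):
    print(event)
```

To run it against a real log, save the output of `dmesg` to a file and pass the file contents to `find_xid_events`.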
Here is my hardware and environment:
MOBO: ASRock Rack EPYCD8-2T
CPU: AMD EPYC 7402
GPU: GeForce RTX 3090 (with driver 455.38)
OS: Ubuntu 20.04.1 LTS server (with kernel 5.4.0 and 5.8.0), Windows 7
With the same hardware and environment I had no issues using an RTX 2080 Ti and a Quadro RTX 4000 (drivers 450.80 and 455.38); the problem appeared only after I upgraded the graphics card to the RTX 3090.
I also tested the same hardware configuration with Windows 7. Windows 7 cannot auto-update the driver for the 3090, and I could not even install driver 457 manually on Windows 7.
I have tested the RTX 3090 in two other PCs (one with Windows 7, one with Ubuntu 20.04 server), and it worked well in both.
So the only possibilities I can see are a driver issue or an incompatibility between the RTX 3090 and the motherboard.
nvidia-bug-report log attached.
I can provide ssh access to the server if needed.
A similar bug report can be found here, but it has no answer: AsRock ROMED8-2D + RTX 3090 black screen on Ubuntu 20.04
I also read this post: Random Xid 61 and Xorg lock-up
But the Xid I get is 62, not 61, and it occurs every time I run nvidia-smi or any other command that tries to talk to the graphics card.
nvidia-bug-report.log.gz (130.4 KB)