NVRM: RmInitAdapter failed! - ETH mining server

Hello all,

This issue has me completely stumped. I have scoured the internet and tried everything that seemed to work for people with a similar issue, but to no avail.

In short, 2 of the 6 graphics cards installed in my ETH mining server cannot be detected by nvidia-smi. One of them was working just fine on the same PCIe port until I installed a new card on a different port; now the new card is fine, but the old one is having problems. The 6th card and its port have never been confirmed functional.

lspci shows all cards:

01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
07:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
08:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
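
For anyone checking the same thing, lspci can also show which kernel driver is bound to each card; filtering on NVIDIA's vendor ID (10de) keeps the output short:

lspci -nnk -d 10de:

The -k flag adds a "Kernel driver in use:" line per device, so it is easy to see whether the nvidia module attached to all six cards or only some of them.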

But dmesg shows a failure of RmInitAdapter from NVRM:

[ 75.298825] NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:624)
[ 75.298873] NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 4
[ 75.415957] NVRM: GPU 0000:08:00.0: RmInitAdapter failed! (0x23:0xffff:624)
[ 75.416004] NVRM: GPU 0000:08:00.0: rm_init_adapter failed, device minor number 5
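
For completeness, the rest of the driver-side picture can be pulled from the kernel log in one go, e.g.:

sudo dmesg | grep -iE 'NVRM|Xid'

Xid entries (if any) would point at runtime GPU faults, whereas RmInitAdapter failures like the ones above happen while the driver is still trying to bring the card up.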

Output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...   On  | 00000000:01:00.0 Off |                  N/A |
| 59%   74C    P2   101W / 120W |   4352MiB /  6076MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...   On  | 00000000:02:00.0 Off |                  N/A |
| 50%   74C    P2    94W / 120W |   4345MiB /  6078MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 106...   On  | 00000000:04:00.0 Off |                  N/A |
| 39%   72C    P2    92W / 120W |   4345MiB /  6078MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 106...   On  | 00000000:05:00.0 Off |                  N/A |
| 43%   72C    P2    94W / 120W |   4345MiB /  6078MiB |     97%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  8MiB |
|    0   N/A  N/A      1116      G   /usr/bin/gnome-shell                1MiB |
|    0   N/A  N/A      1446      C   ethminer                         4337MiB |
|    1   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      1446      C   ethminer                         4337MiB |
|    2   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  4MiB |
|    2   N/A  N/A      1446      C   ethminer                         4337MiB |
|    3   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  4MiB |
|    3   N/A  N/A      1446      C   ethminer                         4337MiB |
+-----------------------------------------------------------------------------+
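
The same listing is available in plain CSV if the table is hard to read, e.g.:

nvidia-smi --query-gpu=index,pci.bus_id,name,memory.total,utilization.gpu --format=csv

Either way, only four GPUs are enumerated; the two cards at 07:00.0 and 08:00.0 never get past driver initialization, so nvidia-smi cannot see them at all.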

I have already tried running nvidia-persistenced on boot without success. I am almost certain it is not a hardware issue, at least for the card that was working on the same PCIe port. I doubt it is a kernel configuration issue (I haven’t touched my kernel). Does anyone have any ideas?
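
For context, enabling the persistence daemon on Ubuntu typically amounts to something like the following, assuming the driver package installed the nvidia-persistenced systemd unit (nvidia-smi -pm 1 is the older per-GPU persistence-mode toggle):

sudo systemctl enable --now nvidia-persistenced
sudo nvidia-smi -pm 1

Persistence mode only keeps an already-initialized GPU initialized, so it was a long shot for cards that fail RmInitAdapter in the first place.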

I am running an almost fresh install of Ubuntu Server 20.04.2 LTS. Here is my bug report: nvidia-bug-report.log.gz (881.0 KB)

Thanks in advance!

Broken risers, wrong PCIe gen?

Highly unlikely. Let me try to paint a picture:

Let | be a PCIe riser, and let g| be a riser with a 1060 installed. This was my configuration initially, with a GPU installed directly on my motherboard:

           g|  g|  |  g|  |  |

This configuration worked perfectly. Now let g|x be a GPU having issues with RmInitAdapter:

         g|  g|  g|  g|x  g|x  |
Device:   1   2   3   4    5  
(device 0 is on board)

I didn’t touch the configuration for device 4. This suggests to me that, at least for that card, the issue is not hardware related, since everything was working before I installed what is now device 3.

This is called crosstalk from bad risers.

Interesting, I have been looking into this and I have come across some discussions talking about how cheap risers can have problems with cross-talk in multi-GPU configurations. I’ll order a couple new ones and see what happens.

OK, so I was messing around with the build, and strangely enough I can only get 4 cards to work at a time no matter what configuration I use. Every riser is fully functional as long as it is one of only 4 connected, but as soon as more than 4 are connected, everything in the higher-numbered PCIe slots drops off.

Could this error be caused by a limitation of my hardware? I have the ASRock Z270 Killer SLI motherboard. I have read that other miners have gotten this board to work with at least 6 cards, but I don’t know the details. Is there some sort of software or BIOS limitation in place that I am unaware of? This seems too reproducible to be cross-talk alone.

Yes, you are correct. I’ve taken a deeper look and:

[    0.226100] pci 0000:07:00.0: BAR 1: no space for [mem size 0x10000000 64bit pref]
[    0.226103] pci 0000:07:00.0: BAR 1: trying firmware assignment [mem 0x20000000-0x2fffffff 64bit pref]
[    0.226105] pci 0000:07:00.0: BAR 1: [mem 0x20000000-0x2fffffff 64bit pref] conflicts with System RAM [mem 0x00100000-0x59316017]
[    0.226108] pci 0000:07:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
[    0.226110] pci 0000:07:00.0: BAR 3: no space for [mem size 0x02000000 64bit pref]
[    0.226112] pci 0000:07:00.0: BAR 3: trying firmware assignment [mem 0x30000000-0x31ffffff 64bit pref]
[    0.226115] pci 0000:07:00.0: BAR 3: [mem 0x30000000-0x31ffffff 64bit pref] conflicts with System RAM [mem 0x00100000-0x59316017]
[    0.226117] pci 0000:07:00.0: BAR 3: failed to assign [mem size 0x02000000 64bit pref]
[    0.226119] pci 0000:00:1c.4: PCI bridge to [bus 07]
[    0.226121] pci 0000:00:1c.4:   bridge window [io  0xa000-0xafff]
[    0.226125] pci 0000:00:1c.4:   bridge window [mem 0xd6000000-0xd70fffff]
[    0.226130] pci 0000:08:00.0: BAR 1: no space for [mem size 0x10000000 64bit pref]
[    0.226132] pci 0000:08:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
[    0.226135] pci 0000:08:00.0: BAR 3: no space for [mem size 0x02000000 64bit pref]
[    0.226137] pci 0000:08:00.0: BAR 3: trying firmware assignment [mem 0x10000000-0x11ffffff 64bit pref]
[    0.226139] pci 0000:08:00.0: BAR 3: [mem 0x10000000-0x11ffffff 64bit pref] conflicts with System RAM [mem 0x00100000-0x59316017]
[    0.226141] pci 0000:08:00.0: BAR 3: failed to assign [mem size 0x02000000 64bit pref]

Please check your BIOS for an option called “Above 4G decoding” or “large/64bit BARs” and enable it. Normally, the NVIDIA driver would put out a clearer error message in that case.
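
After enabling it and rebooting, you can verify that the kernel was able to place the large BARs, e.g.:

sudo dmesg | grep -i 'BAR'
nvidia-smi -L

With above-4G decoding enabled, the 256MB BAR 1 windows should get assigned above the 4GB boundary instead of failing, and nvidia-smi -L should then list all six cards.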

That did the trick! Yeah, that was not a very clear error message at all for something so simple. You’re the best, generix, thanks a million!