NVRM: RmInitAdapter failed!

Hello,

We have three PCs: one with a 2080 Ti GPU and the other two with 3090 GPUs. All three are fine on a fresh CentOS 8.2 install, until I install CUDA or the GeForce RTX drivers from the NVIDIA website. Once I install those, the GPUs don’t always initialize during boot and give the following errors:

[ 1.591037] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 460.32.03 Sun Dec 27 19:00:34 UTC 2020
[ 15.124144] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 15.124176] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
[ 15.169570] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x24:0xffff:1248)
[ 15.169588] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0
[ 24.491038] NVRM: GPU 0000:0a:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
[ 24.491071] NVRM: GPU 0000:0a:00.0: rm_init_adapter failed, device minor number 0

It does not happen during every boot, but on average at least once every three reboots. I’ve exhausted all BIOS and kernel tweaks I could find. I tried Asus and Gigabyte motherboards and Asus and Gigabyte branded 2080 Ti GPUs. The problem persists across all.

Any suggestions?

Thank you,
Bart

nvidia-bug-report.log.gz (511.1 KB)

I am wondering how you established that they were fine without a driver installed. Are you saying that none of these GPUs works correctly for you, and that there is no pattern (i.e., the problem is not specific to a particular machine or a particular GPU) when you exchange the GPUs cyclically between machines?

The cards display without issues with the in-kernel drivers and no NVRM errors occur in dmesg during boot.

And yes, there is no particular pattern. The problem occurs with 3 different 2080 Ti cards and 2 different RTX 3090 cards. I also tried 3 Gigabyte motherboards and 1 Asus motherboard.

I also tried with Above 4G Decoding enabled and disabled in the BIOS.

I have never encountered a case like this, so I have no idea what could be going on. I assume you followed all the standard procedures, like blacklisting the Nouveau driver, etc.
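
For reference, this is roughly what that blacklisting usually looks like on CentOS/RHEL; the file name below is just the common convention, so adjust it to your setup:

# /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0

# Rebuild the initramfs so nouveau is not pulled in early during boot, then reboot:
sudo dracut --force
sudo reboot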

You might want to consult a local expert who can take a look at these systems. There might be something about this situation that is not mentioned here because it did not seem relevant, but that would become obvious to someone physically in front of the machines.

I have an external RTX 3090 connected via Thunderbolt. With the standard drivers it doesn’t initialise at all, but there is a workaround.

The open kernel modules work fine with the RTX 3090, but they lack some power-saving features and have some problems with S3 sleep.

I run them with the options nvidia NVreg_OpenRmEnableUnsupportedGpus=1 kernel module parameter.
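
For anyone who wants to try this, a minimal sketch of how that parameter is typically set; the file name is just a convention, and the verification step assumes the nvidia module is already loaded:

# /etc/modprobe.d/nvidia.conf
options nvidia NVreg_OpenRmEnableUnsupportedGpus=1

# Rebuild the initramfs so the option is applied at boot, then reboot:
sudo dracut --force
sudo reboot

# After reboot, the parameter value can be checked via sysfs (may require root):
sudo cat /sys/module/nvidia/parameters/NVreg_OpenRmEnableUnsupportedGpus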