A thought … K600 card works fine, K2000 produces above errors. Does this really make sense with your hypothesis that it’s kernel/driver preemption?
I don’t even think that preemption is the sole cause; that’s not a new option. It looks more like a timing issue to me, maybe triggered by a gcc/preemption combination, so a different GPU might have different timing issues.
Sadly, I’m getting the same error with voluntary preemption. Building a no-preempt kernel right now. Mind you, I didn’t rebuild the nvidia driver, just booted from a kernel I built from SuSE’s sources with the preemption config var changed.
No go for no-preempt kernel, too. I also tried re-installing the nvidia driver on the no-preempt kernel (390.48) to no avail. Should I ditch the card and get something like a GTX1050?
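When comparing kernels built with different preemption settings, it can help to confirm which preemption model a given build actually has enabled. The following is a minimal sketch, assuming the kernel config is available as text (on many distributions it can be read from a file such as /boot/config-<version>, but that location is an assumption and varies). The function name and the sample config are hypothetical and only illustrate the idea.

```python
# Sketch: report which preemption model is enabled in a kernel config.
# The three option names are the standard Kconfig preemption choices.
PREEMPT_MODELS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)

def preemption_model(config_text):
    """Return the preemption model options set to 'y' in config_text."""
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set".
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name in PREEMPT_MODELS and value == "y":
            enabled.append(name)
    return enabled

# Illustrative sample only, not taken from any poster's system.
sample = """\
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
"""

print(preemption_model(sample))
```

Running this against the config of each kernel you boot makes it easy to verify that the setting really changed between builds.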
Sad. In a similar case with a custom kernel, it boiled down to that kernel option. Did you test that board in another system?
Yes, same problem in a Dell Xeon workstation and an HP Z820 workstation. Both are running openSUSE Tumbleweed, fully updated. Meanwhile, a K600 works just fine in the Dell.
Having the same issue on Lenovo P50 with
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2)
It’s an odd problem; my T530 has no problem under bumblebee:
01:00.0 VGA compatible controller: NVIDIA Corporation GF108M [NVS 5400M] (rev ff) (prog-if ff)
I finally solved my desktop problem by retiring the K2000 and buying a Radeon WX3100 … latest silicon, runs great on a fully open source stack.
To confirm, I also have a Lenovo P50 with the following configuration:
Release:
Red Hat Enterprise Linux Workstation release 7.4 (Maipo)
Kernel:
3.10.0-693.21.1.el7.x86_64
Video:
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2)
NVIDIA Driver Versions tested (from .run installer off nvidia.com):
384.98
390.48
396.18
I am receiving this log message in /var/log/messages when attempting to load X (via gdm):
Apr 14 09:19:14 oc6815873887 kernel: NVRM: failed to copy vbios to system memory.
Apr 14 09:19:14 oc6815873887 kernel: NVRM: RmInitAdapter failed! (0x30:0xffff:669)
Apr 14 09:19:14 oc6815873887 kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
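Since several posters report RmInitAdapter failures with different codes, a small script can pull those codes out of a kernel log so they can be compared across driver or kernel versions. This is a rough sketch: the function name is hypothetical, the pattern is based only on the log lines quoted in this thread, and it makes no claim about what the three fields mean.

```python
import re

# Matches lines such as "RmInitAdapter failed! (0x30:0xffff:669)".
# The pattern is inferred from the messages quoted in this thread.
RM_INIT_FAILED = re.compile(
    r"RmInitAdapter failed! \((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)"
)

def rm_init_failures(log_text):
    """Return a list of (field1, field2, field3) tuples, one per failure line."""
    return [match.groups() for match in RM_INIT_FAILED.finditer(log_text)]

# Log lines copied from the post above.
log = """\
Apr 14 09:19:14 oc6815873887 kernel: NVRM: failed to copy vbios to system memory.
Apr 14 09:19:14 oc6815873887 kernel: NVRM: RmInitAdapter failed! (0x30:0xffff:669)
Apr 14 09:19:14 oc6815873887 kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
"""

print(rm_init_failures(log))
```

Feeding it the contents of /var/log/messages or the output of dmesg gives a quick list of the failure codes seen on that boot.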
To confirm: the kernel module loads correctly; the issue occurs when starting X.
Logs from /var/log/Xorg.0.log
[ 553.366] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 553.366] (EE) NVIDIA(GPU-0): check your system's kernel log for additional error
[ 553.366] (EE) NVIDIA(GPU-0): messages and refer to Chapter 8: Common Problems in the
[ 553.366] (EE) NVIDIA(GPU-0): README for additional information.
[ 553.366] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
[ 553.366] (EE) NVIDIA(0): Failing initialization of X screen 0
[ 553.366] (EE) Screen(s) found, but none have a usable configuration.
[ 553.366] (EE)
[ 553.366] (EE) no screens found(EE)
[ 553.366] (EE)
[ 553.366] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[ 553.366] (EE)
[ 553.370] (EE) Server terminated with error (1). Closing log file.
benjamin5sbqi, see this:
[url]https://devtalk.nvidia.com/default/topic/1031428/linux/390-42-centos7-4-3-10-0-693-21-1-el7-x86_64-nvidia-smi-gives-quot-no-devices-were-found-quot-/post/5247647/#5247647[/url]
I finally got this fixed…
I had to upgrade to a third party kernel build on “elrepo” and build from the latest beta drivers version 396.18 available on nvidia.com.
I followed the instructions to upgrade my CentOS 7.4 system to a 4.16 kernel (including the devel and headers packages, which are not mentioned in the instructions) here: [url]https://www.tecmint.com/install-upgrade-kernel-version-in-centos-7/[/url]
After that I just installed from the standalone nvidia installer and all seems to be working well now. I have two 1920x1080 displays working well with 3d rendering functioning, as is detaching and re-attaching from displays.
I’ve been having the same problem.
I tried updating the video driver and other solutions here, but I only eventually solved it by updating the video card firmware. I was able to use the nvflash_linux utility from techpowerup here: NVIDIA NVFlash (5.735.0) Download | TechPowerUp
To get the rom file I downloaded and unzipped the bios update utility from the manufacturer (in my case Asus Strix gtx970).
I flashed the rom from the manufacturer even though it was the same version as on my card. To use the utility you need to exit the X server and unload (rmmod) all nvidia modules (in my case nvidia_drm, nvidia_modeset and nvidia).
Now the problem seems to be gone, I have no more NVRM messages in dmesg.
Hope it helps someone.
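The modules mentioned above have to be unloaded in dependency order, since a module that is still used by another one cannot be removed. As a rough sketch, the order can be derived from /proc/modules, whose fourth field lists the modules using each entry. The function name is hypothetical, the sample text is illustrative (sizes, counts and addresses are made up), and users outside the nvidia prefix are ignored here, which is a simplification.

```python
def unload_order(proc_modules_text, prefix="nvidia"):
    """Return matching module names in an order that is safe for rmmod."""
    users = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith(prefix):
            continue
        # Fourth field: "-" when unused, else a comma-terminated list of users.
        used_by = [] if fields[3] == "-" else [u for u in fields[3].split(",") if u]
        users[fields[0]] = used_by

    order = []
    remaining = dict(users)
    while remaining:
        # A module is ready once none of its users is still loaded.
        ready = sorted(
            name for name, used_by in remaining.items()
            if not any(user in remaining for user in used_by)
        )
        if not ready:
            raise ValueError("circular module usage, cannot order")
        for name in ready:
            order.append(name)
            del remaining[name]
    return order

# Illustrative sample in /proc/modules format; the numbers are made up.
sample = """\
nvidia_drm 69632 4 - Live 0x0000000000000000
nvidia_modeset 1232896 2 nvidia_drm, Live 0x0000000000000000
nvidia 56459264 90 nvidia_modeset, Live 0x0000000000000000
drm 491520 5 nvidia_drm, Live 0x0000000000000000
"""

print(unload_order(sample))
```

For this sample the result matches the order given in the post: nvidia_drm first, then nvidia_modeset, then nvidia.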
Hi. I have the same problem. I saw the answers above. But I don’t know which version of kernel/driver I need to update. How do you know? Can someone tell me?
OS: Ubuntu 16.04
Kernel: 4.4.0-142-generic
Model: Tesla T4
NVRM version: 450.51.06
CUDA version: 11.0
nvidia-bug-report.log (2.0 MB)
Neither a new kernel nor a new driver will help; your GPU is broken. You can try to reflash your vbios as well, but I suspect the issue is a bit more serious.
RmInitAdapter failed! (0x25:0x51:1238)
rm_init_adapter failed, device minor number 0
Failed to copy vbios to system memory.
RmInitAdapter failed! (0x30:0xffff:794)
Hello, I get this error
RmInitAdapter failed! (0x25:0x51:1238)
rm_init_adapter failed, device minor number 0
Failed to copy vbios to system memory.
RmInitAdapter failed! (0x30:0xffff:794)
on a GeForce GTX 660M with driver version 470 in openSUSE Tumbleweed (kernel version 6.x…), ever since this new kernel was installed. I do not get this error in Ubuntu 20.04 or openSUSE Leap 15.4 (kernel version 5.x…), both also with driver version 470. Does this mean it is a hardware error?
log attached:
nvidia-bug-report.log.gz (740.1 KB)
Solved after driver update.