Failed to copy vbios to system memory

The Quadro RTX 4000 fails to initialize when an iGPU is present. With a CPU that has no integrated GPU, the NVIDIA card works fine. With a CPU that has an iGPU, the iGPU works fine but the NVIDIA card doesn’t. I can install Gentoo Linux with the displays connected to either the iGPU or the dGPU. If I install with a display connected to the dGPU (NVIDIA), the screen goes blank after the first restart (not even a tty). This happens before installing Xorg or any drivers, so the problem must lie in the kernel, the modules in use, or the NVIDIA proprietary driver. The same system works fine in Windows, so a hardware failure can be ruled out. Here is some info from the system booted with the display connected to the iGPU:

uname -r
5.4.38-gentoo

xrandr --listproviders
Providers: number : 1
Provider 0: id: 0x49 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs: 4 outputs: 6 associated providers: 0 name:Intel

journalctl -b 0 -k | grep -i NVRM
Jun 16 20:55:43 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 440.82 Wed Apr 1 20:04:33 UTC 2020
Jun 16 20:55:44 localhost kernel: NVRM: GPU 0000:01:00.0: Failed to copy vbios to system memory.
Jun 16 20:55:44 localhost kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x30:0xffff:755)
Jun 16 20:55:44 localhost kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

journalctl -b 0 -k | grep -i nvidia
Jun 16 20:55:43 localhost kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input17
Jun 16 20:55:43 localhost kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input18
Jun 16 20:55:43 localhost kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input19
Jun 16 20:55:43 localhost kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card2/input20
Jun 16 20:55:43 localhost kernel: nvidia: loading out-of-tree module taints kernel.
Jun 16 20:55:43 localhost kernel: nvidia: module license 'NVIDIA' taints kernel.
Jun 16 20:55:43 localhost kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 241
Jun 16 20:55:43 localhost kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
Jun 16 20:55:43 localhost kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Jun 16 20:55:43 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 440.82 Wed Apr 1 20:04:33 UTC 2020
Jun 16 20:55:44 localhost kernel: nvidia 0000:01:00.0: DMAR: 32bit DMA uses non-identity mapping
Jun 16 20:55:44 localhost kernel: nvidia-smi (364) used greatest stack depth: 12968 bytes left
Jun 16 20:55:44 localhost kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 440.82 Wed Apr 1 19:41:29 UTC 2020
Jun 16 20:55:44 localhost kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Jun 16 20:55:44 localhost kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1

journalctl -b 0 -k | grep -i mtrr
Jun 16 20:55:43 localhost kernel: MTRR default type: uncachable
Jun 16 20:55:43 localhost kernel: MTRR fixed ranges enabled:
Jun 16 20:55:43 localhost kernel: MTRR variable ranges enabled:
Jun 16 20:55:43 localhost kernel: Found optimal setting for mtrr clean up

Please attach the .config of the kernel and a dmesg output.

config.txt (136.6 KB)
dmesg.txt (82.4 KB)

Here are the requested files, thanks!

I should add that I tried both with and without NVIDIA framebuffer support (this should be visible in the config file). Also, a monitor is connected to each GPU, but only the one connected to the Intel GPU displays any output.

I suspect you’ll have to play with iommu options

DMAR: 32bit DMA uses non-identity mapping

Check whether disabling the IOMMU in the BIOS makes the NVIDIA GPU work, or set one of these on the kernel command line: iommu=soft, iommu=off, or intel_iommu=igfx_off.
Other than that, I didn’t find anything obvious in config or dmesg.
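
A minimal sketch for checking which of these options the running kernel was actually booted with. The helper name iommu_params is made up for illustration; it only reads /proc/cmdline (or a string you pass in) and changes nothing. How you add a parameter depends on your bootloader, so that part is left out.

```shell
#!/bin/sh
# iommu_params: hypothetical helper, prints IOMMU-related kernel parameters.
# With no argument it reads /proc/cmdline; otherwise it parses the given string.
iommu_params() {
    if [ "$#" -gt 0 ]; then
        cmdline="$1"
    else
        cmdline="$(cat /proc/cmdline 2>/dev/null)"
    fi
    # Split the command line on whitespace and keep only the IOMMU switches.
    for word in $cmdline; do
        case "$word" in
            iommu=*|intel_iommu=*) printf '%s\n' "$word" ;;
        esac
    done
}

# Show what the current boot used (prints nothing if no such parameter is set).
iommu_params
```

Trying the options one at a time and re-running this after each reboot makes it easier to tell which one, if any, changes the NVRM errors.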

Thanks! I took a look at my IOMMU settings and did not disable them but enabled passthrough by default:

# CONFIG_CALGARY_IOMMU is not set
CONFIG_IOMMU_IOVA=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
# Generic IOMMU Pagetable Support
# end of Generic IOMMU Pagetable Support
# CONFIG_IOMMU_DEBUGFS is not set
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
CONFIG_OF_IOMMU=y
# CONFIG_AMD_IOMMU is not set
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set
CONFIG_INTEL_IOMMU_DEFAULT_ON=y
CONFIG_INTEL_IOMMU_FLOPPY_WA=y

and the error:

NVRM: GPU 0000:01:00.0: Failed to copy vbios to system memory.
NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x30:0xffff:755)
NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

went away. Now CUDA works; the only remaining issue is that there is still no display output through the NVIDIA card. I don’t have any custom Xorg configuration file. I tried disabling the IOMMU, but the problem persists (no display output). I also tried several permutations of framebuffer support (EFI on/off, NVIDIA support on/off, mark VGA/VBE/EFI FB as generic on/off) to no avail. Of course, I also switched the default display device in my motherboard BIOS.

Depending on what kind of setup you want, you’ll have to create an xorg.conf.
Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post. You will have to rename the file ending to something else since the forum software doesn’t accept .gz files (nifty!).

I don’t know if it’s related or not, but I now got this error:

ucsi_ccg 6-0008: con1: failed to register alternate modes

For this error I tried enabling Type-C alternate modes for NVIDIA, but the error kept coming. Full report attached!

nvidia-bug-report.log.txt (65.1 KB)

I forgot to mention that when I installed Gentoo with the CPU that has no integrated graphics, I did not have to create a custom xorg.conf. On that occasion I installed GNOME, but now I am on Xfce, if that makes a difference.

Please remove the xf86-video-intel driver, then check how to enable PRIME to have desktop output from both intel and nvidia gpu:
https://forums.developer.nvidia.com/t/official-driver-384-59-with-geforce-1050m-doesnt-work-on-opensuse-tumbleweed-kde/52620/2?u=generix
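
For reference, a minimal sketch of the RandR 1.4 output-offload commands described in the NVIDIA driver README, wrapped in a function. The function name is made up, and the provider names "modesetting" and "NVIDIA-0" are assumptions: they are the usual names once xf86-video-intel is removed, but check `xrandr --listproviders` on your own system.

```shell
#!/bin/sh
# Sketch: let the NVIDIA GPU render while the other provider's outputs display it.
# enable_prime_outputs is a hypothetical helper; the default provider names are
# assumptions and should be confirmed with `xrandr --listproviders`.
enable_prime_outputs() {
    sink="${1:-modesetting}"   # provider whose outputs drive the monitors
    src="${2:-NVIDIA-0}"       # provider that renders the desktop
    xrandr --setprovideroutputsource "$sink" "$src" && xrandr --auto
}
```

The function has to be called from inside a running X session, for example from a display-manager startup script.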

Unrelated, but you should disable IOMMU debugfs support in the kernel config; there’s a big warning in dmesg.

Thanks! After the next reboot, the display on the NVIDIA card now shows output. The display on the Intel GPU went blank, but I now know it’s a matter of configuration. NVIDIA X Server Settings now works too.

Great help!

You’ll probably just need to run the xrandr commands and set them in the DM.

I wanted to report that my system is now PRIME capable thanks to your debugging! Only one minor problem remains that I couldn’t resolve. The motherboard is set to default to the Intel iGPU, and two monitors are connected to this GPU. During boot, both screens show the console text normally, in mirror mode. When the display manager is about to start, both screens go blank. I noticed that the system was sending the output to the NVIDIA card, because when I connected a cable to the dGPU, the display manager’s login screen showed up (I use LXDM as the display manager). After I enter my user password, control goes back to the iGPU (because of the PRIME configuration) and the two monitors light up. I can confirm PRIME is working because when I start a “graphics intensive” application, the performance level of the NVIDIA card jumps from 0 to 3, the PCIe link speed jumps from gen1 to gen3, and the GPU and PCIe bandwidth utilization also increase.

A minor hassle in the current configuration is that at least one monitor must be connected to the NVIDIA card for me to see anything when the display manager takes control. Also, because that monitor is connected to the NVIDIA card, some program (probably nvidia-settings) automatically detects it and adds it to the final X screen, so I end up with 3 monitors (one of which I can’t see unless I manually switch the input on the monitor that has two cables: one to the Intel GPU and one to the NVIDIA GPU). As a workaround, I manually disabled that monitor in my desktop environment (Xfce); after a reboot, it no longer shows up in the graphical Xfce session. This can probably be fixed easily with additional configuration, but I haven’t found a solution yet.

I would greatly appreciate any help. Here is my xorg.conf:

xorg.conf.txt (1.9 KB)

Oh, and I forgot to mention that I tested both the modesetting driver and the xf86-video-intel driver for the Intel GPU, and both work in my setup. I currently use xf86-video-intel because with the modesetting driver I get horrible screen tearing in Firefox (which might be resolved by recompiling Firefox with the hwaccel USE flag, but I haven’t tested that).

Sorry for the triple reply. I think I found the problem and wanted to share it with you. It turns out this display manager (maybe all of them?) relies on the xorg.conf file to display itself. So what I presumably have to do is move the script that runs xrandr from PostLogout to PreLogin. I’ll let you know if this works out.

To mitigate tearing with the modesetting driver, use the kernel parameter
nvidia-drm.modeset=1
This enables cross-GPU vsync.
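
A small sketch for confirming after a reboot that the parameter took effect. The nvidia-drm module exposes the setting in sysfs, where it reads Y when enabled; the helper name modeset_state and the "unknown" fallback are made up for illustration.

```shell
#!/bin/sh
# modeset_state: hypothetical helper, reports the nvidia-drm modeset setting.
# Prints the sysfs value (Y or N), or "unknown" if the module is not loaded.
modeset_state() {
    param="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo "unknown"
    fi
}

modeset_state
```

If this still reports N after adding the parameter, the bootloader configuration was probably not regenerated or the parameter was misspelled.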

As always, you are right! I spent most of the day trying to find a solution for the display switching but couldn’t solve it. It seems that when the display manager starts, all xorg.conf files are read in order, and while the PRIME configuration file was visible but PRIME was not yet enabled, Xorg would fail on the current graphics provider (Intel) and switch back to the NVIDIA card. The xrandr commands can only be run after the Xorg server has started, and hence I found no way of enabling PRIME before the user could log in***. With my very limited hacking abilities, I decided to live with having to connect the NVIDIA card to a display in order to bypass this hiccup in multi-monitor PRIME setups on Linux.

My next objective was to reduce the screen tearing, but after some testing I found that the proprietary driver indeed doesn’t work well with rotated screens, so I turned my attention back to the modesetting driver. Because of the screen tearing I had almost given up on using PRIME, and although I did see this parameter (nvidia-drm.modeset) on many sites, none of them made clear (to me) how to apply it, until I stumbled upon a forum post that suggested enabling it in the bootloader. And it worked. Flawlessly.

I am very happy with my setup now, and I very likely couldn’t have done it without your help. Thanks!

*** I did manage to enable PRIME automatically, but only ever after the user had logged in, so the LXDM splash screen was always shown through the NVIDIA card.

For simple display managers, you can often put the xrandr commands at the beginning of /etc/X11/Session/Xsession.
Otherwise, I guess lxdm’s LoginReady script should be used for it.
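
A rough sketch of what such a hook could contain, with a guard so that nothing is changed when the expected providers are missing. The function names are made up, and the provider names "modesetting" and "NVIDIA-0" are assumptions to verify against `xrandr --listproviders`; whether the hook runs early enough for the login screen depends on the display manager.

```shell
#!/bin/sh
# Hypothetical hook body for a display-manager script such as lxdm's LoginReady.

# has_provider NAME: succeeds if `xrandr --listproviders` lists a provider NAME.
has_provider() {
    xrandr --listproviders | grep -q "name:$1\$"
}

# setup_prime: route NVIDIA rendering to the modesetting provider's outputs,
# but only when both providers exist; otherwise leave the outputs untouched.
setup_prime() {
    if has_provider NVIDIA-0 && has_provider modesetting; then
        xrandr --setprovideroutputsource modesetting NVIDIA-0
        xrandr --auto
    else
        echo "PRIME providers not found; leaving outputs untouched" >&2
        return 1
    fi
}
```

Calling `setup_prime` at the top of the hook keeps a missing or renamed provider from leaving the login screen blank, since the script then does nothing instead of failing halfway.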