NVRM: This PCI I/O region assigned to your NVIDIA device is invalid

Ok. There goes my day. I’m terrified. So many questions. I know it’s not your job to guide me through all that. But… does this mean all data will need to be moved/backed up to another drive? Am I building the drive up from scratch? Re-installing Ubuntu? Re-installing something else? Confused. :(

Yes. Back up your data, then start from scratch. EFI needs a different partition scheme.
The CSM of this mainboard doesn’t even provide enough resources for the mainboard to fully work in a 64bit OS. I guess it’s only provided to get some ancient 32bit Windows XP to install.

Thanks, @generix. Will do.

@generix and @aplattner, I did a fresh install. The attached log was collected right after the first reboot. dmesg.log (84.6 KB)

The board is still only providing 32bit resources. Please update the BIOS and make sure above 4G decoding is really enabled.

Thanks. What are some of the key messages that indicate 64bit resources are/aren’t available? Ones like this?

[ 0.396029] pnp 00:05: disabling [mem 0xfed10000-0xfed17fff] because it overlaps 0000:03:00.0 BAR 1 [mem 0x00000000-0x3ffffffff 64bit pref]

It’s this:

[    0.333223] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.333224] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.333225] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.333226] pci_bus 0000:00: root bus resource [mem 0x90000000-0xdfffffff window]
[    0.333226] pci_bus 0000:00: root bus resource [mem 0xfc800000-0xfe7fffff window]
[    0.333227] pci_bus 0000:00: root bus resource [bus 00-fe]

Those are the memory windows that can be mapped to PCIe devices. As you can see, all of them lie within the 32bit address range. When large/64bit BARs (Above 4G decoding) are working correctly, there should be at least one window with addresses beyond 32bit.
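
As a rough illustration of that check, here is a minimal Python sketch that scans dmesg text for the root bus mem windows and reports whether any of them reaches past the 4 GiB boundary. The function names and the regular expression are my own assumptions, based only on the line format quoted in this thread; other kernels may print these lines differently.

```python
import re

# Assumed line format, taken from the log excerpt in this thread:
#   pci_bus 0000:00: root bus resource [mem 0x90000000-0xdfffffff window]
WINDOW_RE = re.compile(
    r"root bus resource \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)(?: window)?\]"
)

FOUR_GIB = 1 << 32  # first address that no longer fits in 32 bits


def mem_windows(dmesg_text):
    """Return (start, end) integer pairs for every root bus mem window."""
    return [(int(a, 16), int(b, 16)) for a, b in WINDOW_RE.findall(dmesg_text)]


def has_window_above_4g(dmesg_text):
    """True if at least one mem window ends at or above the 4 GiB boundary."""
    return any(end >= FOUR_GIB for _, end in mem_windows(dmesg_text))


# Sample copied from the log excerpt above (every window is below 4 GiB).
SAMPLE = """\
[    0.333223] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.333224] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.333225] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.333226] pci_bus 0000:00: root bus resource [mem 0x90000000-0xdfffffff window]
[    0.333226] pci_bus 0000:00: root bus resource [mem 0xfc800000-0xfe7fffff window]
[    0.333227] pci_bus 0000:00: root bus resource [bus 00-fe]
"""

for start, end in mem_windows(SAMPLE):
    print(f"{start:#x}-{end:#x}")
print(has_window_above_4g(SAMPLE))  # False
```

To try it on a real system, the output of `dmesg` could be saved to a file and its contents passed to `has_window_above_4g`.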

I enabled above 4G decoding (which was preventing bootup prior to my reinstall). New log file here. dmesg.log (77.1 KB)

This line implies more than 32bit, but doesn’t explicitly imply 64bit, yes?
pci_bus 0000:00: root bus resource [mem 0x4000000000-0x7fffffffff window]

Looks good now, 64bit resources enabled and Teslas are functional. Please install the driver now, should work.
Edit: on 64bit resource display, leading zeros are suppressed for readability. They just have to be longer than 32bits.
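
To make the leading-zero point concrete, here is a small Python sketch (the helper names are made up for illustration) that pads an address to its full 64bit form and checks whether it needs more than 32 bits:

```python
def padded_64bit(hex_address):
    """Show the address with the leading zeros that dmesg suppresses."""
    return f"{int(hex_address, 16):#018x}"


def needs_more_than_32_bits(hex_address):
    """True if the address cannot be expressed in 32 bits."""
    return int(hex_address, 16).bit_length() > 32


# Window start from the log line quoted in this thread.
print(padded_64bit("0x4000000000"))             # 0x0000004000000000
print(needs_more_than_32_bits("0x7fffffffff"))  # True
print(needs_more_than_32_bits("0xdfffffff"))    # False
```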

I installed CUDA and the driver and it appears that all is well. Thanks!!! Your responsiveness and expertise are MUCH appreciated! --Matt

Hi, I’m having the same problem on a DL580 G7 with the M40 24GB GPU. What can I do to avoid this error?

Hi, I encounter the same problem on a Gigabyte B550 Gaming X motherboard. CPU: AMD Ryzen 7 3700X. I plugged the K80 into the PCIEX16 slot, which is connected directly to the CPU.

lspci | grep -i nvidia
03:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
0b:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
0c:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

the dmesg log uploaded:
dmesg.log (87.9 KB)

The “above 4G decode” option in the BIOS was enabled. I found the problem is the same as above. This is a newly installed Ubuntu 18.04. The same problem also occurs on a newly installed 20.04.

Is this a BIOS problem?

Hi! I’m trying to get an EXP GDC Beast plus GTX 750 external GPU setup running on a Lenovo ThinkPad X230, on Linux (Lubuntu 18.04). When the card is active during boot and the “nvidia” profile was selected via prime-select before shutdown, the computer won’t boot. When the card is not running, boot is successful (I guess it just falls back to the Intel HD 4000 graphics). When I connect the card I get the same dmesg log as above with just a few different numbers: it’s BAR0, PCI:0000:04:00.0 and major device number 238, but still error -1.

I was told to switch to UEFI and do a firmware upgrade, did both. The “above 4G decoding” option is found nowhere in the BIOS of the X230. I read the section about mem issues in the driver installation guide and it seems using some kernel parameters might be helpful. Additionally I was pointed at 1vyrain, which seems promising, but I would need to do a firmware downgrade in order for it to be successful.

The latter frightens me, so I wanted to ask if there were other known options.

Thanks!

We have the same issue with a K80.

Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: H81M-S2H
Version: x.x
Serial Number: To be filled by O.E.M.
Asset Tag: To be filled by O.E.M.
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: To be filled by O.E.M.
Chassis Handle: 0x0003
Type: Motherboard
Contained Object Handles: 0
[  207.160967] NVRM: The system BIOS may have misconfigured your GPU.
[  207.160969] nvidia: probe of 0000:04:00.0 failed with error -1
[  207.160986] NVRM: The NVIDIA probe routine failed for 2 device(s).
[  207.160987] NVRM: None of the NVIDIA devices were initialized.
[  207.161110] nvidia-nvlink: Unregistered the Nvlink Core, major device number 240
[  207.220116] nvidia-nvlink: Nvlink Core is being initialized, major device number 240
[  207.221123] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:03:00.0)
[  207.221126] NVRM: The system BIOS may have misconfigured your GPU.
[  207.221132] nvidia: probe of 0000:03:00.0 failed with error -1
[  207.221153] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:04:00.0)
[  207.221155] NVRM: The system BIOS may have misconfigured your GPU.
[  207.221160] nvidia: probe of 0000:04:00.0 failed with error -1
[  207.221186] NVRM: The NVIDIA probe routine failed for 2 device(s).
[  207.221191] NVRM: None of the NVIDIA devices were initialized.
[  207.221484] nvidia-nvlink: Unregistered the Nvlink Core, major device number 240
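
For anyone comparing several logs, the failing devices can be pulled out of such output with a short Python sketch like the one below. The regular expression is an assumption based solely on the NVRM lines pasted in this thread; other driver versions may word the message differently.

```python
import re

# Assumed format, copied from the NVRM lines in this thread:
#   NVRM: BAR1 is 0M @ 0x0 (PCI:0000:03:00.0)
BAR_RE = re.compile(
    r"NVRM: BAR(\d+) is (\d+)M @ (0x[0-9a-fA-F]+) \(PCI:([0-9a-fA-F:.]+)\)"
)


def invalid_bars(dmesg_text):
    """Return (pci_address, bar_index, size_mib, base_address) tuples."""
    return [
        (pci, int(bar), int(size), int(base, 16))
        for bar, size, base, pci in BAR_RE.findall(dmesg_text)
    ]


# Sample copied from the log above.
SAMPLE = """\
[  207.221123] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:03:00.0)
[  207.221132] nvidia: probe of 0000:03:00.0 failed with error -1
[  207.221153] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:04:00.0)
"""

for pci, bar, size, base in invalid_bars(SAMPLE):
    print(f"{pci}: BAR{bar} size={size}M base={base:#x}")
```

A BAR reported with size 0M at address 0x0, as here, means the firmware assigned no address space to that region.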

Same with me, Tesla M40. Ubuntu 18.04.5, kernel 5.4.0-generic, with the BIOS “above 4G” option enabled. CUDA is installed, but only my Quadro K420 is detected.
My motherboard is an MSI Z490.

[ 3.697002] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:01:00.0)
[ 3.697003] NVRM: The system BIOS may have misconfigured your GPU.
[ 3.796660] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 3.796661] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 440.33.01 Wed Nov 13 00:00:22 UTC 2019

Some motherboards (desktop series in particular) might simply not support these graphics card models, or might only have 32-bit addressing on the PCIe interface, as it seems to me.
In my case, Gigabyte said that the desktop motherboard “doesn’t support Tesla, but workstation motherboards do”.

Thanks for sharing the information.

I am trying to use a Tesla K80 on an Asus Sabertooth 990FX R1 mainboard using Kubuntu. It does not work due to the “This PCI I/O region assigned to your NVIDIA device is invalid” error mentioned above. nvidia-smi says “no devices found”. The BIOS does not provide any large BAR / 64 bit BAR / above 4G option.

The mainboard supports PCIe 2.0 but I could not find information about address widths. The mentioned root bus resource addresses are 32 bit maximum, which implies the mainboard doesn’t support 64 bit BAR.

The strange thing is: I have successfully installed another K80 with that mainboard before (i.e. device was found and nvidia-smi printed correct information). I have not changed any BIOS settings and I am using the same driver (version 460) on the same OS (Kubuntu 20.04).

Is there a way to know if a PCIe slot supports 64 Bit addressing when mainboard specifications do not contain that fact and when no BIOS option is available? Any idea why one K80 is found while another one fails? Are there K80s that support 32 Bit?

Those kinds of Teslas actually exist; the only noticeable difference is that nvidia-smi shows “Display Mode: Enabled”, so they behave more like normal graphics cards and only claim 256MB of BAR1 space.
I don’t know how to put them into that mode, likely by flashing a different VBIOS.