Error when installing nvidia driver - Tesla K40m on Linux RHEL

Same error as with .130; as I said, it's a kernel problem.
I checked the previous threads about this bug. You could try the kernel parameter
pci=realloc=off
Otherwise, this seems to have started with the kernel update from RHEL 7.5 to 7.6:
3.10.0-862.14.4.el7.x86_64 was working,
3.10.0-957.5.1.el7.x86_64 is not working, depending on the BIOS.
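
When testing with and without the parameter, it helps to confirm which kernel actually booted and whether pci=realloc=off really reached the kernel command line. A minimal sketch, assuming a Linux host where /proc/cmdline is readable; the helper names pci_options and realloc_disabled are made up for this illustration:

```python
import platform
from pathlib import Path


def pci_options(cmdline):
    """Return the value of every pci= parameter on a kernel command line."""
    return [tok[len("pci="):] for tok in cmdline.split() if tok.startswith("pci=")]


def realloc_disabled(cmdline):
    """True if realloc=off appears in any pci= option list."""
    # pci= takes a comma-separated list, e.g. pci=realloc=off,nocrs
    return any("realloc=off" in opts.split(",") for opts in pci_options(cmdline))


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    cmdline = path.read_text() if path.exists() else ""
    print("running kernel:", platform.release())
    print("pci=realloc=off active:", realloc_disabled(cmdline))
```

Run it once per boot; if the second line prints False, the parameter was not applied and that test run does not count.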

Thanks, tried 3.10.0-862.14.4.el7.x86_64 with and without pci=realloc=off, didn’t work.
Tried 3.10.0-957.5.1.el7.x86_64 with and without pci=realloc=off, didn’t work.
Is my only other option downgrading the BIOS? Anything else I can try?

Nothing I can think of. Do you have a dmesg output from 3.10.0-862.14.4.el7.x86_64?

Thanks. Sure, the dmesg output from 3.10.0-862.14.4.el7.x86_64 is attached.
dmesg.log (163 KB)

Still the same error.
Come to think of it, I don't see how 8 Teslas could ever fit into a 32-bit address space. So there's probably no way around either downgrading the BIOS or contacting HPE to ask whether a newer BIOS is available with 64-bit PCI resources enabled again.
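
To put rough numbers on that, here is a small sketch of the arithmetic. The BAR sizes and the size of the 32-bit MMIO window below are assumed, illustrative values, not figures read from this server or from the K40m; the real sizes are listed per device by lspci -v. Alignment and other PCI devices are ignored, so the real situation is tighter than this.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def total_bar_bytes(num_gpus, bar_sizes):
    """Address space needed if every GPU requests the same set of BARs."""
    return num_gpus * sum(bar_sizes)


def fits(num_gpus, bar_sizes, window_bytes):
    """True if all BARs fit into the given MMIO window (alignment ignored)."""
    return total_bar_bytes(num_gpus, bar_sizes) <= window_bytes


if __name__ == "__main__":
    # Assumed values for illustration only.
    bars_per_gpu = [16 * MiB, 256 * MiB, 32 * MiB]
    window_below_4g = 2 * GiB  # the part of the 32-bit space left for MMIO

    needed = total_bar_bytes(8, bars_per_gpu)
    verdict = "fits" if fits(8, bars_per_gpu, window_below_4g) else "does not fit"
    print(needed // MiB, "MiB needed:", verdict)
```

Under these assumptions 8 GPUs already exceed the window, and a single larger BAR per card makes it worse.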

Hi Generix - I downgraded the BIOS and the NVIDIA driver still won’t successfully install. I have attached the logs and bug report.

nvidia-installer.log (4.56 KB)
nvidia-bug-report.log.gz (99.9 KB)
nvidia-uninstall.log (1.15 KB)

OK, I checked the manuals for your hardware. The switch for 32-bit/64-bit BAR addresses is not a BIOS-controlled soft switch but a hardware switch on the system board:
http://h20628.www2.hp.com/km-ext/kmcsdirect/emr_na-c03930500-2.pdf

Thanks, generix! Enabling 64-bit BAR addresses in the BIOS resolved the issue. The NVIDIA driver is now successfully installed and working.

For those who run into this same issue, here is what I did to enable 64-bit BAR addresses on the ProLiant SL270s Gen8 SE server.

  1. Boot the server and, when prompted, press F9 to enter the ROM-Based Setup Utility (RBSU).
  2. While in RBSU, press Ctrl + A. The Service Options will appear at the bottom of the list.
  3. Arrow down to Service Options and press Enter.
  4. Arrow down to PCI Express 64-bit BAR Support and press Enter.
  5. Select Enabled and press Enter.
  6. Exit out of the utility and reboot the server. Large BAR is now enabled.

    Note: The Large BAR function is always enabled when System Maintenance Switch 9 (a hardware switch on
    the system board) is set to the ON position. Setting System Maintenance Switch 9 to the OFF position
    allows Large BAR to be enabled or disabled in the RBSU.
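
After the reboot, one way to check that the setting took effect is to see whether any NVIDIA device has a memory region mapped at or above 4 GiB. A minimal sketch, assuming a Linux host with sysfs mounted at /sys; the function names are made up for this illustration. It reads the per-device resource file, where each line holds start, end and flags in hex, and matches on the NVIDIA PCI vendor ID 0x10de.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"
FOUR_GIB = 1 << 32


def parse_resource(text):
    """Parse a sysfs PCI 'resource' file into (start, end, flags) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(f, 16) for f in fields)
        if start == 0 and end == 0:
            continue  # unused resource slot
        regions.append((start, end, flags))
    return regions


def has_region_above_4g(regions):
    """True if any region starts at or above the 4 GiB boundary."""
    return any(start >= FOUR_GIB for start, _end, _flags in regions)


if __name__ == "__main__":
    for dev in sorted(Path("/sys/bus/pci/devices").glob("*")):
        vendor = dev / "vendor"
        if not vendor.exists() or vendor.read_text().strip() != NVIDIA_VENDOR_ID:
            continue
        regions = parse_resource((dev / "resource").read_text())
        print(dev.name, "has a BAR above 4 GiB:", has_region_above_4g(regions))
```

If every GPU still reports False, the 64-bit BAR setting is probably not active; checking the RBSU option and the position of Switch 9 again would be the next step.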