NVIDIA A30: "No devices were found" on HP ProLiant 585 G7, CentOS 7

Hi, we are installing a new NVIDIA A30 in our HP ProLiant 585 G7.

We can install the NVIDIA-Linux-x86_64-470.82.01.run driver,

but when we try to run nvidia-smi we get this error message:
No devices were found

From dmesg | grep NVRM we get:

[ 9.537026] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 462.145888] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:41:00.0)
[ 462.146044] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 462.146048] NVRM: None of the NVIDIA devices were initialized.
[ 1624.354351] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 1674.449971] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 2199.542247] NVRM: GPU 0000:41:00.0: RmInitAdapter failed! (0x24:0xffff:1220)
[ 2199.542617] NVRM: GPU 0000:41:00.0: rm_init_adapter failed, device minor number 0
[ 2200.187143] NVRM: GPU 0000:41:00.0: RmInitAdapter failed! (0x24:0xffff:1220)
[ 2200.187410] NVRM: GPU 0000:41:00.0: rm_init_adapter failed, device minor number 0

And from /proc/driver/nvidia/gpus/0000:41:00.0 we get:

Model: NVIDIA A30
IRQ: 96
GPU UUID: GPU-9dd77b2e-aa4e-b963-efbd-17c864b60f6d
Video BIOS: ??.??.??.??.??
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:41:00.0
Device Minor: 0
GPU Excluded: No
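
The key line in the dmesg output above is the one reporting BAR1 as 0M at address 0x0, which means no address range was assigned to that BAR. As a rough sketch, a log like this can be scanned for such lines with a few lines of Python. The regular expression is modeled only on the single NVRM line quoted in this post, so treating it as the general message format is an assumption, and the function name find_unassigned_bars is made up for illustration.

```python
import re

# Pattern modeled on the one NVRM line quoted in this thread (assumption:
# other driver versions may word or format the message differently).
BAR_LINE = re.compile(
    r"NVRM: BAR(?P<bar>\d+) is (?P<size_mb>\d+)M @ (?P<addr>0x[0-9a-fA-F]+) "
    r"\(PCI:(?P<pci>[0-9a-fA-F:.]+)\)"
)


def find_unassigned_bars(dmesg_text):
    """Return (pci_address, bar_index, size_mb, address) for every reported
    BAR whose size or base address is zero."""
    hits = []
    for match in BAR_LINE.finditer(dmesg_text):
        size_mb = int(match.group("size_mb"))
        addr = int(match.group("addr"), 16)
        if size_mb == 0 or addr == 0:
            hits.append((match.group("pci"), int(match.group("bar")), size_mb, addr))
    return hits


# Lines copied from the dmesg output in this post.
SAMPLE = """\
[ 462.145888] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:41:00.0)
[ 462.146044] NVRM: The NVIDIA probe routine failed for 1 device(s).
"""

for pci, bar, size_mb, addr in find_unassigned_bars(SAMPLE):
    print(f"{pci}: BAR{bar} is unassigned ({size_mb}M @ {addr:#x})")
```

On a live system the text would come from dmesg itself rather than from the SAMPLE string.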

The nouveau driver is disabled.

Any ideas on how to solve this problem?

Kind Regards

Please make sure that “above 4G decoding” or “large/64bit BARs” is enabled, CSM is disabled in the BIOS, and the system is booting using EFI.
If that doesn’t help, please set the kernel parameter
pci=realloc
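
Before retesting, it can help to confirm that the running system really was booted the way this advice assumes. The sketch below checks two things: whether /sys/firmware/efi exists (present only on an EFI boot) and which pci= options appear in /proc/cmdline. The helper names pci_options and booted_via_efi are invented for this example, and it assumes Python 3 is available on the machine.

```python
import os


def pci_options(cmdline):
    """Collect the options passed via pci=... on a kernel command line.

    "pci=realloc,nocrs" yields ["realloc", "nocrs"];
    "pci=realloc=off" yields ["realloc=off"].
    """
    options = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "pci" and sep:
            options.extend(opt for opt in value.split(",") if opt)
    return options


def booted_via_efi():
    """The kernel exposes /sys/firmware/efi only when booted in EFI mode."""
    return os.path.isdir("/sys/firmware/efi")


# Report on the running system if /proc/cmdline is readable (Linux only).
if os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as handle:
        running = handle.read()
    print("EFI boot:", booted_via_efi())
    print("pci= options in effect:", pci_options(running))
```

If pci=realloc is missing from the list after a reboot, the change to the boot configuration did not take effect.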

Hi, in our BIOS we don’t have the “above 4G decoding” or “large/64bit BARs” options.

As for pci=realloc, we added this parameter in /etc/default/grub and then regenerated the
GRUB configuration with the command grub2-mkconfig -o /boot/grub2/grub.cfg

In particular, with pci=realloc and pci=realloc=on it isn’t possible to install the NVIDIA drivers. With pci=realloc=off we are able to reinstall the NVIDIA driver, but nothing changes: we are back at the starting point, nvidia-smi => No devices were found

We also tried following this guide step by step: Install Nvidia Drivers on RHEL | Kinetica Docs
but nothing changed; we always get the same error.

Do you have any other ideas?

Kind Regards

IIRC, the BIOS setting is a bit hidden on HPE servers, in “service options”:
https://forums.developer.nvidia.com/t/nvrm-this-pci-i-o-region-assigned-to-your-nvidia-device-is-invalid/81645/4
Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

nvidia-bug-report.log.gz (126.6 KB)

Hi, our nvidia-bug-report is attached.

Kind regards

Now this is a really old machine, and the bad news is that, IIRC, this won’t work: it is an old non-UEFI multi-socket AMD machine, and the Linux kernel has disabled 64-bit resources on those due to instability. 64-bit resources are needed to get the A30 to work.
https://forums.developer.nvidia.com/t/nvidia-smi-shows-no-devices-were-found/167930/4