Hi, we are installing a new NVIDIA A30 in our HP Proliant 585 G7.
We can install the NVIDIA-Linux-x86_64-470.82.01.run driver,
but when we try to run nvidia-smi we get this error message:
No devices were found
From dmesg | grep NVRM we get:
[ 9.537026] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 462.145888] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:41:00.0)
[ 462.146044] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 462.146048] NVRM: None of the NVIDIA devices were initialized.
[ 1624.354351] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 1674.449971] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.82.01 Wed Oct 27 21:21:55 UTC 2021
[ 2199.542247] NVRM: GPU 0000:41:00.0: RmInitAdapter failed! (0x24:0xffff:1220)
[ 2199.542617] NVRM: GPU 0000:41:00.0: rm_init_adapter failed, device minor number 0
[ 2200.187143] NVRM: GPU 0000:41:00.0: RmInitAdapter failed! (0x24:0xffff:1220)
[ 2200.187410] NVRM: GPU 0000:41:00.0: rm_init_adapter failed, device minor number 0
and from /proc/driver/nvidia/gpus/0000:41:00.0 we get
Model: NVIDIA A30
GPU UUID: GPU-9dd77b2e-aa4e-b963-efbd-17c864b60f6d
Video BIOS: ??.??.??.??.??
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:41:00.0
Device Minor: 0
GPU Excluded: No
The nouveau driver is off.
Any ideas how to solve this problem?
Please make sure that “above 4G decoding” or “large/64bit BARs” is enabled and CSM is disabled in the BIOS, and that the system is booting using EFI.
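A rough way to check the last point from the running system, as a sketch only: the helper name boot_mode is made up for this example, and it assumes the usual Linux convention that /sys/firmware/efi exists only when the kernel was booted via EFI. The lspci line assumes pciutils is installed and uses the bus address 41:00.0 from the logs above.

```shell
#!/bin/sh
# boot_mode: hypothetical helper, not part of any NVIDIA tool.
# Prints "EFI" if the given firmware directory exists, "legacy BIOS" otherwise.
# The directory is an optional argument only so the check can be tried elsewhere.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "EFI"
    else
        echo "legacy BIOS"
    fi
}

boot_mode

# Show the memory regions (BARs) the kernel assigned to the GPU, if lspci is
# available. Run as root for complete output.
if command -v lspci >/dev/null 2>&1; then
    lspci -vv -s 41:00.0 | grep -i 'region' || true
fi
```

If the first line printed is "legacy BIOS", the system is not booting via EFI, whatever the BIOS menus show.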
If that doesn’t help, please set kernel parameter
Hi, in our BIOS we don’t have “above 4G decoding” or “large/64bit BARs” options.
About pci=realloc: we added this parameter in /etc/default/grub and then updated
GRUB with the command grub2-mkconfig -o /boot/grub2/grub.cfg
In particular, with pci=realloc and pci=realloc=on it isn’t possible to install the NVIDIA drivers. With pci=realloc=off we are able to reinstall the NVIDIA driver, but nothing changes: we are back at the starting point, nvidia-smi => no devices were found
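For anyone repeating these steps, a small sketch for confirming that the parameter actually reached the running kernel after the reboot. The helper name has_kernel_param is made up for this example. The commented GRUB line assumes the common GRUB_CMDLINE_LINUX variable in /etc/default/grub, and the "..." stands for whatever options are already there.

```shell
#!/bin/sh
# Assumed edit in /etc/default/grub (keep the existing options in place of "..."):
#   GRUB_CMDLINE_LINUX="... pci=realloc"
# followed by: grub2-mkconfig -o /boot/grub2/grub.cfg and a reboot.

# has_kernel_param: hypothetical helper. Succeeds only if the parameter appears
# as a whole word in the command-line file (default: /proc/cmdline), so
# "pci=realloc" does not match "pci=realloc=off".
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

if has_kernel_param pci=realloc; then
    echo "pci=realloc is active"
else
    echo "pci=realloc is not on the running kernel's command line"
fi
```

The whole-word match matters here because the three variants tried in this thread (pci=realloc, pci=realloc=on, pci=realloc=off) share a prefix.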
We also tried to follow this guide step by step: Install Nvidia Drivers on RHEL | Kinetica Docs
but nothing changed, always the same error.
Do you have any other ideas?
IIRC, the BIOS setting is a bit hidden on HPE servers, in “service options”.
Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.
nvidia-bug-report.log.gz (126.6 KB)
Hi, our nvidia bug report is attached.
This is a really old machine, and the bad news is that, IIRC, this won’t work: it is an old non-UEFI multi-socket AMD machine, and the Linux kernel has disabled 64-bit resources on those due to instability. 64-bit resources are needed to get the A30 to work.