Installing Tesla P40 vGPU on RHEL 8.7

I have a RHEL 8.7 physical server with a Tesla P40 installed.

It shows up in lspci output:


04:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)

However, after installing the drivers downloaded from the licensing portal, I see the following:

$ nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

and that:

$ dmesg | grep vfio
[ 3.464768] [nvidia-vgpu-vfio] Unable to get symbol for nvidia_vgpu_vfio_get_ops from nvidia.ko
[ 13.558106] [nvidia-vgpu-vfio] Unable to get symbol for nvidia_vgpu_vfio_get_ops from nvidia.ko

I have blacklisted nouveau as suggested in the Red Hat documentation, but the nvidia_vgpu_vfio driver module will not load.

How can I troubleshoot to find the reason why it will not load? What is the correct installation procedure? It seems like I’ve missed a step somewhere. My server supports SR-IOV, and it is enabled in the BIOS. There are no other NVIDIA cards in the system, just the onboard graphics.

I used dnf to install the rpm provided in the download. I’m not sure what else to do with it. It seems others who have had this issue didn’t have nouveau blacklisted, as I do.

$ cat /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
blacklist nouveau
options nouveau modeset=0

and

$ lsmod | grep nouveau
$ – no output –

But I do see it in lspci -k

$ lspci -k | grep nvidia
Kernel modules: nouveau, nvidia_vgpu_vfio, nvidia
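
A note on reading that output: in lspci -k, the "Kernel modules:" line lists every installed module that declares support for the device, whether or not it is loaded, so nouveau appearing there does not by itself mean it is in use. The driver actually bound to the card is reported on a separate "Kernel driver in use:" line, which is omitted when nothing is bound. A minimal sketch for checking this follows. The helper name bound_driver is made up for illustration, and 04:00.0 is the address from the lspci line earlier in the thread.

```shell
#!/usr/bin/env bash
# Address of the Tesla P40, taken from the lspci line earlier in the thread.
GPU_ADDR="${GPU_ADDR:-04:00.0}"

# bound_driver (hypothetical helper): reads `lspci -k` output on stdin and
# prints the driver bound to the device, or "none" if no driver is bound.
bound_driver() {
    local drv
    drv=$(sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p')
    echo "${drv:-none}"
}

# Only query real hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    echo "Driver bound to $GPU_ADDR: $(lspci -k -s "$GPU_ADDR" 2>/dev/null | bound_driver)"
fi
```

If this prints "none", no driver has claimed the card, which would be consistent with nvidia-smi being unable to communicate with the driver.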

How is this driver supposed to work?

This is the file I downloaded:
NVIDIA-GRID-RHEL-8.7-525.85.07-525.85.05-528.24.zip

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

The bug log is attached: https://file.io/A0jBWdOgSzNF

The BIOS is not providing enough resources for the Tesla. Please enable “above 4G decoding” or “large/64-bit BARs” and disable CSM in the BIOS, then reinstall the OS in EFI mode.
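
To check the two conditions mentioned above from the running system, something like the following sketch may help. It assumes the card is still at 04:00.0 (from the lspci output earlier in the thread). The helper names boot_mode and unassigned_bars are made up for illustration. A memory region that lspci reports as <unassigned> or <ignored> generally means no address space was found for that BAR.

```shell
#!/usr/bin/env bash
# Address of the Tesla P40, taken from the lspci line earlier in the thread.
GPU_ADDR="${GPU_ADDR:-04:00.0}"

# boot_mode (hypothetical helper): /sys/firmware/efi only exists when the
# kernel was booted through UEFI rather than legacy BIOS/CSM.
boot_mode() {
    if [ -d /sys/firmware/efi ]; then
        echo "UEFI"
    else
        echo "legacy BIOS/CSM"
    fi
}

# unassigned_bars (hypothetical helper): reads `lspci -vv` output on stdin
# and prints memory regions that were left without an address.
unassigned_bars() {
    grep -E 'Memory at <(unassigned|ignored)>' || true
}

echo "Boot mode: $(boot_mode)"

# Only query real hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -vv -s "$GPU_ADDR" 2>/dev/null | unassigned_bars
fi
```

Any region lines printed after the boot mode point to BARs that could not be mapped. No region lines means lspci reported an address for every memory region on the card.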

Unfortunately, I don’t think this server supports that feature set. I was actually looking at that prior to my post here, but your response confirmed I’m in the ballpark. That server board is probably 13-14 years old, and I don’t believe it supports those features. It does support NVIDIA 3 and 6 GB cards, but this is a Tesla P40, so 24 GB. I guess there is more to it than what I’m writing here…if I’m wrong please let me know and I’ll give it a try.

The board is a Supermicro X8DTT-H+

I have another workstation I’m going to try this card in if there is no way to use the above. That second workstation does support that setting. I have it on now. :)

I checked the board’s manual and the resources it offers; no Tesla will work in it.
Remember that Teslas were not built to work in regular workstations. They don’t have fans, so you might need something like a fan adapter.