Rocky Linux 9.2 NVIDIA-SMI has failed

Hello there.
I’m trying to install the vGPU guest driver on Rocky Linux 9.2, but I get an error.
I installed this driver package: “nvidia-linux-grid-510-510.47.03-1.x86_64”

# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

lspci | grep -E "VGA|3D"

06:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Tesla P6] (rev a1)

dkms status

nvidia/510.47.03: added

dkms install nvidia/510.47.03

Sign command: /lib/modules/5.14.0-284.30.1.el9_2.x86_64/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module:
Cleaning build area...
'make' -j4 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.14.0-284.30.1.el9_2.x86_64 IGNORE_CC_MISMATCH='1' modules...(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.14.0-284.30.1.el9_2.x86_64 (x86_64)
Consult /var/lib/dkms/nvidia/510.47.03/build/make.log for more information.

From the hypervisor:

nvidia-smi

Tue Nov 14 09:28:21 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P6            On   | 00000000:18:00.0 Off |                  Off |
| N/A   29C    P8     9W /  90W |  16234MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   3006655    C+G   vgpu                            16192MiB |

nvidia-bug-report.log (406.1 KB)

Any advice will help me.
Thx

The driver is too old for the Rocky 9.2 kernel, so it doesn’t compile. You’ll need to upgrade the vGPU install on the host.

https://docs.nvidia.com/grid/

Thx for the reply and sorry for my incompetence, but I didn’t see where it says Rocky 9.2 is not compatible with the driver version 510.47.03

Best Regards!

It doesn’t say so directly. The log you provided shows that the 510 driver doesn’t compile on a current Rocky kernel. The link I provided shows under “All Release Branches” that you’re using an outdated vGPU version, 14.0, with 14.4 being EOL since February.

generix, thanks for your answer.

Interestingly, I built a new VM with Rocky 8.8, which runs kernel 4.18, but I get the same error.

Building module:
Cleaning build area...
'make' -j4 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=4.18.0-477.27.1.el8_8.x86_64 IGNORE_CC_MISMATCH='' modules.(bad exit status: 2)
Error! Bad return status for module build on kernel: 4.18.0-477.27.1.el8_8.x86_64 (x86_64)
Consult /var/lib/dkms/nvidia/510.47.03/build/make.log for more information.

Please attach the referenced make.log.
In general, those kernels are heavily patched and contain backports from current kernels. The 510.47 driver you’re using is from around January 2022, so it won’t compile on any recent kernel.
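If attaching the whole make.log is awkward, the lines that actually caused the build to fail are usually enough to start with. Below is a minimal sketch of a helper that prints them. The function name `show_build_errors` is made up for illustration, the default path is the one DKMS printed earlier in this thread, and the search patterns (`error:` from the compiler, `***` from make) are an assumption about what the log contains.

```shell
#!/bin/sh
# Hypothetical helper (not part of DKMS): print the first failing lines
# from a DKMS build log so they can be pasted into a forum post.
show_build_errors() {
    # Default to the log path DKMS reported in this thread; pass another
    # path as the first argument to inspect a different build.
    log="${1:-/var/lib/dkms/nvidia/510.47.03/build/make.log}"

    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi

    # -n prefixes each match with its line number in the log.
    # 'error:' is assumed to match compiler diagnostics, '*** ' make failures.
    # head keeps the output short enough to paste.
    grep -n -E 'error:|\*\*\* ' "$log" | head -n 20
}
```

Calling `show_build_errors` with no argument reads the 510.47.03 log mentioned above; for the Rocky 8.8 VM the path is the same, since DKMS keys the build directory by module version.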

generix, thank you for your answer. Unfortunately I don’t have access to the private account area, so I can’t download newer drivers for my GPU host. Do you maybe know an alternative way?
Best regards
Alex

Not really. Of course, you could try patching the driver so it works on the RL9.2 kernel, which resembles a 5.19 vanilla kernel, likely using bits from the patches for the 470 driver:
https://gist.github.com/joanbm/c00f9e19731d80269a4badc595f63b68
https://gist.github.com/joanbm/d630a02dde00bf087f64091d331f6dbb

Thanks a lot, I’ll try.