I am trying to get vLLM to run on AWS EC2 g6f.large instances with RHEL 10. I tried installing both the 580 and the 590 CUDA drivers, following the “NVIDIA Driver Installation Guide” for Red Hat Enterprise Linux. However, after rebooting, both fail to load and produce similar dmesg output:
[ 1201.204723] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 1201.205659] NVRM: None of the NVIDIA devices were initialized.
[ 1203.830436] NVRM: The NVIDIA GPU 0000:31:00.0 (PCI ID: 10de:27b8)
NVRM: installed in this system is not supported by the
NVRM: NVIDIA 580.105.08 driver release.
NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
NVRM: in this release's README, available on the operating system
NVRM: specific graphics driver download page at www.nvidia.com.
Looking at the appendix this message mentions (Appendix A. Supported NVIDIA GPU Products), it looks like the 580 drivers should indeed support the L4, given that the PCI ID seems to match what’s stated in the documentation:
[root@ip-172-31-1-52 ec2-user]# lspci -nn | grep NVIDIA
31:00.0 3D controller [0302]: NVIDIA Corporation AD104GL [L4] [10de:27b8] (rev a1)
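In case it helps anyone reproduce the comparison, here is a minimal Python sketch that pulls the vendor:device pair out of an lspci -nn line and checks it against the ID from the NVRM message. The helper name pci_ids and the regex are only an illustration, and the sketch assumes the ID pair is the last bracketed xxxx:xxxx group on the line. It only confirms that the two IDs agree; it says nothing about whether the driver actually supports the device.

```python
import re

# Matches a bracketed "vendor:device" pair as printed by `lspci -nn`, e.g. "[10de:27b8]".
# Class codes such as "[0302]" and names such as "[L4]" do not match this pattern.
PCI_ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", re.IGNORECASE)


def pci_ids(lspci_line):
    """Return (vendor, device) in lowercase from one `lspci -nn` line, or None if absent."""
    matches = PCI_ID_RE.findall(lspci_line)
    if not matches:
        return None
    # Assumption: the device's own ID pair is the last bracketed pair on the line.
    vendor, device = matches[-1]
    return vendor.lower(), device.lower()


# The lspci line and the NVRM-reported ID are copied from the output in this post.
line = "31:00.0 3D controller [0302]: NVIDIA Corporation AD104GL [L4] [10de:27b8] (rev a1)"
dmesg_id = ("10de", "27b8")

print(pci_ids(line) == dmesg_id)
```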
Anything I’m missing here?