nvidia-smi says "No devices were found" on RHEL 7.9, A100 40GB

Hello all,

I am trying to install the GPU driver for an A100 40GB in a VMware VM. GPU passthrough appears to be working, since lshw -c display shows the following:

[root@localhost ~]# lshw -c display
  *-display
       description: VGA compatible controller
       product: SVGA II Adapter
       vendor: VMware
       physical id: f
       bus info: pci@0000:00:0f.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: vga_controller bus_master cap_list rom
       configuration: driver=vmwgfx latency=64
       resources: irq:16 ioport:1070(size=16) memory:e8000000-efffffff memory:fe000000-fe7fffff memory:c0400000-c0407fff
  *-display
       description: 3D controller
       product: GA100 [GRID A100 PCIe 40GB]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:13:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm bus_master cap_list
       configuration: driver=nvidia latency=248
       resources: irq:16 memory:fc000000-fcffffff memory:e4000000-e5ffffff

Running lspci -v | grep -i nvidia shows:

[root@localhost ~]# lspci -v | grep -i nvidia
13:00.0 3D controller: NVIDIA Corporation GA100 [GRID A100 PCIe 40GB] (rev a1)
        Subsystem: NVIDIA Corporation Device 145f
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia

Here are some relevant entries from /var/log/messages:

Aug 26 15:03:56 localhost kernel: resource sanity check: requesting [mem 0xfc700000-0xfd6fffff], which spans more than PCI Bus 0000:13 [mem 0xfc000000-0xfcffffff]
Aug 26 15:03:56 localhost kernel: caller os_map_kernel_space.part.6+0xbe/0xc0 [nvidia] mapping multiple BARs
Aug 26 15:03:56 localhost kernel: NVRM: GPU 0000:13:00.0: RmInitAdapter failed! (0x24:0xffff:1209)
Aug 26 15:03:56 localhost kernel: NVRM: GPU 0000:13:00.0: rm_init_adapter failed, device minor number 0

I first tried installing CUDA Toolkit 12.6 on RHEL 8.4. The installation went smoothly and nvcc --version printed version information, but nvidia-smi reported "no devices were found".
I tried blacklisting nouveau, but the result was the same.
Then I thought it might be a kernel version mismatch (4.18-553), so I switched to another VMware VM running RHEL 7.9 with kernel 3.10-1160 and installed only the driver, but the result was the same.
I tried installing DKMS, updating GCC, and running chmod 777 on the driver, but none of these worked.
I tried the drivers for both 12.6 and 11.4.
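
For reference, blacklisting nouveau on RHEL usually looks something like the sketch below. It is not necessarily what was run on these VMs. The helper name write_nouveau_blacklist and the file name blacklist-nouveau.conf are assumptions, and the sketch writes to a scratch file so it is safe to run as-is.

```shell
# Hypothetical helper: write a nouveau blacklist file to the given path.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Scratch file for illustration; on a real system the target would be
# something like /etc/modprobe.d/blacklist-nouveau.conf (assumed name).
conf=$(mktemp)
write_nouveau_blacklist "$conf"
cat "$conf"

# After installing the file, the initramfs is normally rebuilt and the VM rebooted:
#   dracut --force
#   reboot

# Then check whether the nouveau module is still loaded.
if lsmod 2>/dev/null | grep -q '^nouveau'; then
    echo "nouveau loaded"
else
    echo "nouveau not loaded"
fi
```

Note that the lspci output above already shows "Kernel driver in use: nvidia", so nouveau does not appear to be bound to the GPU in this case.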

Here are some logs that might be useful. What should I do next?
nvidia-bug-report.log.gz (585.1 KB)
nvidia-installer.log (279.8 KB)

Hello?