VMware 8.0 VM not starting with an A40 GPU configured as Shared Direct

When using an A40 GPU in VMware 8.0, we get an error when starting the VM:

“VM start failed Nvidia A40 vGPU Could not initialize plugin ‘/usr/lib64/vmware/plugin/libnvidia-vgx.so’ for vGPU ‘nvidia_a40-1b’”

When the GPU is used in passthrough mode instead, it shows as available to the VM but errors out when used: it shows as valid until it is accessed, then the GPU acts like it shuts down. Per VMware, everything looks OK in the setup on the VMware side. They spent over an hour checking everything.

I have installed the daemon and the driver, and I have created a web version of the GRID licensing server for Shared Direct. What can I do to resolve this issue?

Thanks for your help.

[root@colo-vsphere-02:~] nvidia-smi
Fri Aug 22 15:52:44 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.65.05              Driver Version: 580.65.05      CUDA Version: N/A      |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     On  |   00000000:21:00.0 Off |                  Off |
|  0%   31C    P8             33W /  300W |       0MiB /  49140MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

[root@colo-vsphere-02:~] esxcli hardware pci list | grep -i nvidia -A 20
Vendor Name: NVIDIA Corporation
Device Name: NVIDIA A40
Configured Owner: VMkernel
Current Owner: VMkernel
Vendor ID: 0x10de
Device ID: 0x2235
SubVendor ID: 0x10de
SubDevice ID: 0x145a
Device Class: 0x0302
Device Class Name: 3D controller
Programming Interface: 0x00
Revision ID: 0xa1
Interrupt Line: 0xff
IRQ: 255
Interrupt Vector: 0x00
PCI Pin: 0x00
Spawned Bus: 0x00
Flags: 0x3001
Module ID: 64
Module Name: nvidia
Chassis: 0
Physical Slot: 2
Slot Description: PCIe Slot 2
Device Layer Bus Address: s00000002.00
Passthru Capable: true
Parent Device Address: 0000:20:03.1
Dependent Device Address: 0000:21:00.0
Reset Method: Bridge reset
FPT Sharable: true
NUMA Node: 2
Hardware Label:
Virtual Function:

[root@colo-vsphere-02:~] esxcli software vib list | grep -i NV
NVD-VMware_ESXi_8.0.0_Driver 580.65.05-1OEM.800.1.0.20613240 NVD VMwareAccepted 2025-08-18 host
nvdgpumgmtdaemon 570.133.10-1OEM.700.1.0.15843807 NVD VMwareAccepted 2025-08-21 host
nvme-pcie 1.2.4.15-1vmw.803.0.0.24022510 VMW VMwareCertified 2025-05-01 host
nvmerdma 1.0.3.9-1vmw.803.0.0.24022510 VMW VMwareCertified 2025-05-01 host
nvmetcp 1.0.1.32-1vmw.803.0.70.24674464 VMW VMwareCertified 2025-05-01 host
nvmxnet3-ens 2.0.0.23-6vmw.803.0.0.24022510 VMW VMwareCertified 2025-05-01 host
nvmxnet3 2.0.0.31-12vmw.803.0.0.24022510 VMW VMwareCertified 2025-05-01 host
lsuv2-intelv2-nvme-vmd-plugin 2.7.2173-2vmw.803.0.0.24022510 VMware VMwareCertified 2025-05-01 host
lsuv2-nvme-pcie-plugin 1.0.0-1vmw.803.0.0.24022510 VMware VMwareCertified 2025-05-01 host
vmware-esx-esxcli-nvme-plugin 1.2.0.56-1vmw.803.0.0.24022510 VMware VMwareCertified 2025-05-01 host
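
For anyone comparing the two NVIDIA VIBs in the list above, here is a minimal Python sketch that pulls out their release numbers. It assumes the whitespace-separated column order shown in this paste (name, version, vendor, ...) and treats the vendor value `NVD` as the NVIDIA marker. The function names are hypothetical. It only reports what is installed. Whether differing releases are related to the plugin error is not established here.

```python
# Rows copied from the `esxcli software vib list` output above.
VIB_LIST = """\
NVD-VMware_ESXi_8.0.0_Driver 580.65.05-1OEM.800.1.0.20613240 NVD VMwareAccepted 2025-08-18 host
nvdgpumgmtdaemon 570.133.10-1OEM.700.1.0.15843807 NVD VMwareAccepted 2025-08-21 host
nvme-pcie 1.2.4.15-1vmw.803.0.0.24022510 VMW VMwareCertified 2025-05-01 host
"""


def nvidia_vib_versions(text):
    """Return {vib_name: release} for rows whose vendor column is NVD."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed columns: name, version, vendor, acceptance level, date, platforms.
        if len(fields) < 3 or fields[2] != "NVD":
            continue
        name, full_version = fields[0], fields[1]
        # Keep only the part before the "-1OEM..." build suffix.
        versions[name] = full_version.split("-", 1)[0]
    return versions


def release_branches(versions):
    """Return the set of leading release numbers (e.g. '580') across the VIBs."""
    return {release.split(".", 1)[0] for release in versions.values()}


versions = nvidia_vib_versions(VIB_LIST)
branches = release_branches(versions)

for name, release in versions.items():
    print(f"{name}: {release}")
if len(branches) > 1:
    print("NVIDIA VIBs come from different release branches:", sorted(branches))
```

On the rows above this reports 580.65.05 for the driver VIB and 570.133.10 for nvdgpumgmtdaemon.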

[root@colo-vsphere-02:~] nvidia-smi -q

==============NVSMI LOG==============

Timestamp : Fri Aug 22 15:57:51 2025
Driver Version : 580.65.05
CUDA Version : Not Found
vGPU Driver Capability
Heterogenous Multi-vGPU : Supported

Attached GPUs : 1
GPU 00000000:21:00.0
Product Name : NVIDIA A40
Product Brand : NVIDIA
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : Yes
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : N/A
vGPU Device Capability
Fractional Multi-vGPU : Supported
Heterogeneous Time-Slice Profiles : Supported
Heterogeneous Time-Slice Sizes : Supported
Homogeneous Placements : Not Supported
MIG Time-Slicing : Not Supported
MIG Time-Slicing Mode : Disabled
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Enabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322122016380
GPU UUID : GPU-ebb03a45-ff15-3374-ec87-40da0a7a5239
GPU PDI : 0xb22801ca26c9f26e
Minor Number : 0
VBIOS Version : 94.02.5C.00.0F
MultiGPU Board : No
Board ID : 0x2100
Board Part Number : 900-2G133-0100-130
GPU Part Number : 2235-895-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0200.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
GPU Recovery Action : None
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x21
Device : 0x00
Domain : 0x0000
Device Id : 0x223510DE
Bus Id : 00000000:21:00.0
Sub System Id : 0x145A10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : N/A
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 100 KB/s
Rx Throughput : 50 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 0 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Clocks Event Reasons Counters
SW Power Capping : 0 us
Sync Boost : 0 us
SW Thermal Slowdown : 0 us
HW Thermal Slowdown : 0 us
HW Power Braking : 0 us
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 468 MiB
Used : 0 MiB
Free : 48673 MiB
BAR1 Memory Usage
Total : 65536 MiB
Used : 1 MiB
Free : 65535 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
GPU : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
DRAM Encryption Mode
Current : N/A
Pending : N/A
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Channel Repair Pending : No
TPC Repair Pending : No
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 30 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 88 C
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Average Power Draw : 33.41 W
Instantaneous Power Draw : 33.43 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Module Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Power Smoothing : N/A
Workload Power Profiles
Requested Profiles : N/A
Enforced Profiles : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : 1740 MHz
Memory : 7251 MHz
Default Applications Clocks
Graphics : 1740 MHz
Memory : 7251 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1740 MHz
SM : 1740 MHz
Memory : 7251 MHz
Video : 1530 MHz
Max Customer Boost Clocks
Graphics : 1740 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Summary : N/A
Bandwidth : N/A
Route Recovery in progress : N/A
Route Unhealthy : N/A
Access Timeout Recovery : N/A
Incorrect Configuration : N/A
Processes : None
Capabilities
EGM : disabled
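
Since the `nvidia-smi -q` output above is long, here is a small Python sketch for pulling out a few fields when comparing hosts. It is an illustration only, with a hypothetical function name. It assumes the flattened layout of this paste, where indentation is lost, so each `key : value` line is attributed to the most recent line that has no ` : ` in it. That attribution is only reliable for keys that directly follow their section header.

```python
# A few lines copied from the flattened `nvidia-smi -q` output above.
SMI_LOG = """\
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
PCIe Generation
Max : 4
Current : 1
ECC Mode
Current : Disabled
Pending : Disabled
"""


def parse_smi_log(text):
    """Map 'Section/Key' (or plain 'Key' before any section) to its value."""
    fields = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(" : ")
        if not sep:
            # A line without " : " is treated as a section header.
            section = line
            continue
        name = f"{section}/{key.strip()}" if section else key.strip()
        fields[name] = value.strip()
    return fields


fields = parse_smi_log(SMI_LOG)
for name in ("GPU Virtualization Mode/Host VGPU Mode",
             "PCIe Generation/Current",
             "ECC Mode/Current"):
    print(f"{name} = {fields[name]}")
```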