GPU: GRID K1
Model: Dell PowerEdge R720
System: vSphere 6.5
This purple screen appears after I install the driver and reboot.
Earlier, I tried changing the server (a different R720) and the system version (6.0U2 and 6.5), but it always happens :(
How should I fix it? Thanks.
The driver you’ve listed above as being installed is the “Offline_Bundle” driver. I’m unsure what this driver is for or if it will cause you any issues (I don’t use it), but I have asked internally at Nvidia for clarification and I’ll post back when I have the information (if anyone else can answer that, please feel free to comment below).
You may want to try uninstalling that driver, rebooting the host and installing the other driver in the download: NVIDIA-kepler-VMware_ESXi_6.5_Host_Driver_367.64-1OEM.622.214.171.12440417.vib. This is the .vib that is in the main .zip file along with the windows drivers and documentation.
If you’d like to wait for clarification on what the driver differences are before changing them, then that’s fine and as said, I’ll update as soon as I hear back.
If the host won’t boot and keeps giving you the PSOD, you can try disabling the PCIe slot where the GPU is located so it can’t be detected (or just temporarily remove the GPU from the host). Once the host has booted, manually uninstall the driver, then re-enable the PCIe slot (or re-fit the GPU) and install the new driver.
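As a rough sketch of that uninstall / reinstall sequence, the small Python helper below only prints the ESXi shell commands I would expect to run; it does not execute anything. The VIB name and the datastore path are assumptions for illustration (they are not confirmed anywhere in this thread), so check the real name with `esxcli software vib list` on your own host first.

```python
import shlex

# Assumed VIB name for illustration only; confirm the real one on the host
# with `esxcli software vib list` before removing anything.
ASSUMED_VIB_NAME = "NVIDIA-kepler-VMware_ESXi_6.5_Host_Driver"


def driver_swap_commands(new_vib_path, old_vib_name=ASSUMED_VIB_NAME):
    """Return (note, command) pairs for swapping a host driver VIB.

    Nothing is executed here; the commands are meant to be reviewed and
    then typed into the ESXi shell by hand.
    """
    if not new_vib_path.startswith("/"):
        # esxcli expects an absolute path for a local .vib file.
        raise ValueError("esxcli needs an absolute path to the .vib file")
    return [
        ("Put the host into maintenance mode",
         "esxcli system maintenanceMode set --enable true"),
        ("List installed VIBs and note the exact NVIDIA VIB name",
         "esxcli software vib list"),
        ("Remove the currently installed driver",
         "esxcli software vib remove -n " + shlex.quote(old_vib_name)),
        ("Reboot (re-enable the PCIe slot or re-fit the GPU at this point)",
         "reboot"),
        ("Install the other driver from the download",
         "esxcli software vib install -v " + shlex.quote(new_vib_path)),
        ("Reboot so the new driver is loaded",
         "reboot"),
        ("Leave maintenance mode once the host is back up",
         "esxcli system maintenanceMode set --enable false"),
    ]


if __name__ == "__main__":
    # Hypothetical datastore path; replace it with wherever you uploaded the .vib.
    for note, command in driver_swap_commands("/vmfs/volumes/datastore1/new-driver.vib"):
        print("# " + note)
        print(command)
```

Printing the commands rather than running them is deliberate: on a host that is already throwing a PSOD it seems safer to review each step before typing it in.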
Something else for you to check: is your R720 BIOS as up to date as it can be? You can check this with the "Lifecycle Controller" when you start your server, and check the Dell FTP site for updates for the system.
It’s the same situation when I used the .vib on vSphere.
Now I have removed the K1 GPU card and exported the vSphere log.
It’s often worth checking known configuration issues in our KB search: http://nvidia.custhelp.com/app/home/
A search on "PSOD GRID" shows:
Can you check your MSI config?