M5000 GPU Pass-through to VM

I am having trouble getting a VM to load and display a graphics package.
The VM runs RHEL 7 and uses PCI pass-through.

I created a Linux RHEL x64 VM.
I was able to activate the M5000 GPU for the VM via PCI.

When the VM boots, a modified XORG RHEL OS is loaded via a PXE menu.

I can see the OS load as there are several messages being displayed on the VM screen.
It continues and then it stops at the message:
“Started OpenSSH server daemon”

Looking at the Xorg.log, I see messages such as:

NVIDIA GLX Module 390.25 …
Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
NVIDIA Unified Driver for all Supported NVIDIA GPUs

Using VT number 1

No device detected

Fatal Server Error

No screens found

I am not sure what this specialized OS runs, but it loads and works on a physical server.

Any ideas on what I need to add or modify to get the OS to load and show the graphics on the VM screen?

Thanks,
I will run these and report back later.

I did create a new RHEL 7 VM and loaded the PXE image; the result is the same, see below.

The Xorg log file shows:

[ 19.153] (II) Module glx: vendor="NVIDIA Corporation"
[ 19.154] compiled for 4.0.2, module version = 1.0.0
[ 19.154] Module class: X.Org Server Extension
[ 19.154] (II) NVIDIA GLX Module 390.25 Wed Jan 24 19:23:51 PST 2018
[ 19.154] (II) LoadModule: "nvidia"
[ 19.154] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[ 19.154] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 19.154] compiled for 4.0.2, module version = 1.0.0
[ 19.154] Module class: X.Org Video Driver
[ 19.154] (II) NVIDIA dlloader X Driver 352.93 Wed Jan 24 18:57:05 PST 2018
[ 19.154] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 19.154] (++) using VT number 1

[ 19.157] (EE) No devices detected.
[ 19.157] (EE)
Fatal server error:

If I run nvidia-smi, I can see the driver is loaded and working with the card:

Mon Jun 18 15:56:53 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.25                 Driver Version: 390.25                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M5000        Off  | 00000000:0B:00.0 Off |                  N/A |
|  0%   34C    P0    46W / 150W |      0MiB /  8126MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

thanks

Are you saying I should set BusID to "PCI:11:00:0" in the xorg.conf file, as below?

What exactly does 11:00:0 represent?

Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GRID M60-4Q"
BusID "PCI:11:00:0"
EndSection

Thanks, I will look into this next week and respond when I find out something.

Update,

I modified xorg.conf, adding BusID as below:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:11:0:0"
EndSection
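
As far as I can tell, the 11 in the BusID is just the bus number 0B from the nvidia-smi Bus-Id column (00000000:0B:00.0) written in decimal: xorg.conf expects decimal bus:device:function, while nvidia-smi and lspci print hex. A minimal shell sketch of that conversion is below; the function name to_xorg_busid is my own invention and not part of any NVIDIA or Xorg tool.

```shell
#!/bin/sh
# to_xorg_busid is a hypothetical helper name, not an NVIDIA or Xorg tool.
# It converts a hex PCI address ([domain:]bus:device.function) into the
# decimal "PCI:bus:device:function" form used by BusID in xorg.conf.
to_xorg_busid() (
    addr="$1"
    # Drop an optional leading PCI domain such as "00000000:".
    case "$addr" in
        *:*:*) addr="${addr#*:}" ;;
    esac
    bus="${addr%%:*}"     # e.g. "0B"
    rest="${addr#*:}"     # e.g. "00.0"
    dev="${rest%%.*}"     # e.g. "00"
    fn="${rest#*.}"       # e.g. "0"
    # printf reads a 0x-prefixed argument as hexadecimal and prints decimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
)

to_xorg_busid "00000000:0B:00.0"   # prints PCI:11:0:0
```

The same function also accepts the shorter lspci-style form, for example to_xorg_busid "0b:00.0".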

I restarted the VM, but it still hangs and I get a different error…

[20.913] (--) NVIDIA(GPU-0)
[20.913] (EE) NVIDIA(0): Failed to assign any connected display devices to X screen 0.
[20.913] (EE) NVIDIA(0): Set AllowEmptyInitialConfiguration if you want the server
[20.913] (EE) NVIDIA(0): to start anyway
[20.913] (EE) NVIDIA(0): Failing initialization of X screen 0
[20.913] (II) UnloadModule: "nvidia"
[20.913] (II) UnloadSubModule: "wfb"
[20.913] (II) UnloadSubModule: "fb"
[20.913] (EE) Screen(s) found, but none have a usable configuration.
[20.913] (EE)
Fatal server error:
[20.913] (EE) no screens found(EE)
[20.913] (EE)
Please consult the The X.Org Foundation support at http://wiki.x.org for help.
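
The log itself names an option to try. As far as I understand the NVIDIA driver README, options such as AllowEmptyInitialConfiguration can go in either the Device or the Screen section of xorg.conf, so one thing to test would be adding it under the Driver line of the Device section. Below is a rough shell sketch that prints a patched copy of a config file; the function name and the paths in the example are placeholders of mine, and I have not confirmed that this is the right fix for a pass-through GPU with no monitor attached.

```shell
#!/bin/sh
# add_empty_init_option is a hypothetical helper name. It prints a copy of the
# given xorg.conf with the AllowEmptyInitialConfiguration option (the name the
# log message suggests) appended after every Driver "nvidia" line.
# The original file is not modified.
add_empty_init_option() {
    sed '/^[[:space:]]*Driver[[:space:]]*"nvidia"/a\
    Option         "AllowEmptyInitialConfiguration" "True"' "$1"
}

# Example (placeholder paths): write the patched config to a new file for review.
# add_empty_init_option /etc/X11/xorg.conf > /tmp/xorg.conf.new
```

Writing to a separate file first makes it easy to compare against the current config before swapping it into the PXE image.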

I retyped the log file and may have entered the wrong driver version and date.
Below is from the old Xorg log.

[ 19.154] (II) NVIDIA GLX Module 390.25 Wed Jan 24 19:23:51 PST 2018

[ 19.154] (II) NVIDIA dlloader X Driver 390.25 Wed Jan 24 18:57:05 PST 2018

[ 19.154] (++) using VT number 1

[ 19.157] (EE) No devices detected.
[ 19.157] (EE)
Fatal server error

I ran the commands:

### device visible:

lspci -vvv -d 10de:*

---- no match

### driver loaded:

lsmod | grep nvidia

---- comes back with a list of nvidia modules (nvidia-drm, nvidia-modeset, nvidia, etc.)

### no errors in driver:

dmesg | egrep 'nvidia|NVRM'

---- comes back with a listing showing IRQ info, Allocated GPU, Freed GPU (nvidia 0000:0b:00.0: irq 69 for MSI/MSI-X, nvidia-modeset: Freed GPU:0 (GPU-25aa9821-…………………))

### driver responding to basic command (try two times):

nvidia-smi

---- comes back with a page showing info for the M5000 both times

I will try the commands.

Going out and buying a K2 is not an option; we first have to prove that the GPU will work in a VMware VM on ESXi 6.7 with pass-through.