nvidia-xconfig output doesn't work for vGPU

[Related to my other recent post but I’m posting separately to hopefully make the issue clear and easy to respond to.]

With ESXi 6 U2, regardless of vGPU profile, the 361.45.09 GRID driver installer on Linux (e.g. RHEL 6) asks whether to create xorg.conf. If you say "yes", the resulting xorg.conf does not work: startx gives "(EE) No devices detected". The result is the same as running ‘nvidia-xconfig’ manually without arguments, which gives a device section like:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

If one explicitly specifies the vGPU device’s BusID (e.g. nvidia-xconfig --busid=PCI:2:0:0), then X.org starts successfully. The device section looks like:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:2:0:0"
EndSection
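
One gotcha when filling in the BusID by hand: lspci prints the bus, device and function in hexadecimal, while xorg.conf reads the BusID fields as decimal, so the two only look identical for small numbers. Below is a minimal Python sketch of the conversion. The helper name lspci_to_busid is made up for illustration and is not part of nvidia-xconfig or any NVIDIA tool, and it assumes the device sits in PCI domain 0.

```python
import re

# lspci-style address: optional 4-digit domain, then bus:device.function,
# all in hexadecimal, e.g. "02:00.0" or "0000:02:00.0".
_ADDR = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<dev>[0-9a-fA-F]{2})\.(?P<fn>[0-7])$"
)


def lspci_to_busid(addr: str) -> str:
    """Convert an lspci address (hex) to an xorg.conf BusID string (decimal).

    Hypothetical helper for illustration only. Assumes PCI domain 0.
    """
    m = _ADDR.match(addr.strip())
    if m is None:
        raise ValueError(f"not an lspci-style address: {addr!r}")
    if m.group("domain") and int(m.group("domain"), 16) != 0:
        # Non-zero domains need a different BusID form; out of scope here.
        raise ValueError("non-zero PCI domain is not handled in this sketch")
    bus = int(m.group("bus"), 16)
    dev = int(m.group("dev"), 16)
    fn = int(m.group("fn"), 16)
    return f"PCI:{bus}:{dev}:{fn}"


if __name__ == "__main__":
    print(lspci_to_busid("02:00.0"))
```

For the device in this post, "02:00.0" becomes "PCI:2:0:0", which matches the working section above. An address such as "0a:00.0" would need "PCI:10:0:0", not "PCI:0a:0:0".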

(I note that when e.g. Horizon View is installed it completely rewrites xorg.conf and adds an explicit BusID.)

Question: Is it expected that nvidia-xconfig generates a non-working xorg.conf? If not, is there a plan to eventually fix it?

Thanks.

[edit: attached logs]
nvidia-bug-report.log.gz (64.8 KB)
vmware.log.gz (47.7 KB)

Hi Nathan,

Thanks for highlighting this - I’ll pass it on to the engineering team.

Best wishes,
Rachel

Further note: Citrix XenServer 6.5 SP1 with a CentOS 7.1 guest and 367.43 drivers on both hypervisor and guest does not require BusID; nvidia-xconfig produces a working xorg.conf out of the box.

Further note: Bare metal on RHEL 6.6 with 352.99 drivers (latest Tesla, non-Grid driver) does require BusID.

Hi Nathan,

I’ve spoken to a helpful engineer… Can you upload a log? The engineer said:

Can we get an nvidia-bug-report.log from this configuration? If the virtualization environment is exposing another GPU, it’s likely that X is simply trying to use that GPU with the NVIDIA driver, which won’t work, hence the need for an explicit BusID. There are many configurations where an explicit BusID may be needed; nvidia-xconfig by default won’t assign one, since it makes the resulting xorg.conf more flexible.

He also added: FWIW, X won’t find non-VGA devices. A lot of Tesla/GRID cards are 3D controllers, e.g. the Tesla M4:

lspci | grep NV

01:00.0 3D controller: NVIDIA Corporation Device 1431 (rev a1)

nvidia-xconfig could be smart enough to add a BusID if none of the NVIDIA GPUs are VGA devices… presumably if you’re using nvidia-xconfig to generate an xorg.conf you intend to use an NVIDIA GPU. But with multiple screens/GPUs it quickly becomes impossible to know what the right thing to do would be.
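
To make the suggested heuristic concrete, here is a rough Python sketch. It is not how nvidia-xconfig behaves today, and the function names nvidia_devices and suggest_busid are invented for illustration. It assumes plain lspci output without the domain prefix (i.e. not lspci -D) and only proposes a BusID in the unambiguous case: exactly one NVIDIA device, and no NVIDIA device listed as a VGA controller.

```python
import re

# One line of plain `lspci` output: "<bus>:<dev>.<fn> <class>: <description>"
# Address fields are hexadecimal. Lines with a domain prefix are not matched.
_LINE = re.compile(
    r"^(?P<bus>[0-9a-f]{2}):(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7])\s+"
    r"(?P<cls>[^:]+):\s+(?P<desc>.*)$",
    re.IGNORECASE,
)


def nvidia_devices(lspci_text):
    """Return a list of (busid, device_class) for NVIDIA devices."""
    devices = []
    for line in lspci_text.splitlines():
        m = _LINE.match(line.strip())
        if m is None or "nvidia" not in m.group("desc").lower():
            continue
        # xorg.conf expects decimal fields, lspci prints hex.
        busid = "PCI:%d:%d:%d" % (
            int(m.group("bus"), 16),
            int(m.group("dev"), 16),
            int(m.group("fn"), 16),
        )
        devices.append((busid, m.group("cls").strip()))
    return devices


def suggest_busid(lspci_text):
    """Suggest a BusID only when the choice is unambiguous, else None."""
    devices = nvidia_devices(lspci_text)
    has_vga = any(cls.lower().startswith("vga") for _, cls in devices)
    if len(devices) == 1 and not has_vga:
        return devices[0][0]
    # Several NVIDIA devices, or one that X can already find: do not guess.
    return None


if __name__ == "__main__":
    sample = "01:00.0 3D controller: NVIDIA Corporation Device 1431 (rev a1)"
    print(suggest_busid(sample))
```

Fed the Tesla M4 line quoted above, the sketch suggests "PCI:1:0:0". With two or more NVIDIA devices it returns None, which is the multi-GPU case where no automatic choice is obviously right.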

Could you run lspci on your card so we can see what it is?

It may be that this is by design and not something we can "fix", but better documentation/instructions might be necessary.

Rachel

I've attached logs to the first message in this thread. These are from the VMware vGPU scenario, not the bare-metal one. The engineer is welcome to ping nkidd @ opentext . com if they want answers in a more timely fashion.