One K260Q vGPU works, second fails -> vmiop_log: error: /usr/lib/libnvidia-vgx.so

We are running several Dell R720 servers, each with the latest BIOS, one GRID K2, and XenServer 6.2 SP1 with all hotfixes (including the latest version of XS62ESP1005). We are using the latest NVIDIA manager and drivers.

We have three VMs defined per server. One VM is mapped to the first physical GPU (pGPU) on the K2 via pass-through. The other two VMs are mapped to the second GPU on the K2 using the K260Q profile.

The first K260Q VM works fine. However, when we boot the second K260Q VM, we see problems both inside the VM (a driver error) and in the XenServer logs. See the log excerpt below.

Any thoughts?

I am attaching the nvidia-bug-reports for two of the servers along with a screenshot of XenCenter summarizing our setup.

Apr 30 10:47:05 coe-xen502g xenstored: A89616 write /local/domain/44/platform/vgpu_config /usr/share/nvidia/vgx/grid_k260q.conf
Apr 30 10:47:08 coe-xen502g fe: vgpu-44[30274]: main: --config = '/usr/share/nvidia/vgx/grid_k260q.conf'
Apr 30 10:47:08 coe-xen502g fe: vgpu-44[30274]: demu_initialize: PLUGIN CONFIG: /usr/share/nvidia/vgx/grid_k260q.conf,gpu-pci-id=0000:07:00.0
Apr 30 10:47:08 coe-xen502g fe: vgpu-44[30274]: vmiop_log: notice: pluginconfig: /usr/share/nvidia/vgx/grid_k260q.conf,gpu-pci-id=0000:07:00.0
Apr 30 10:47:08 coe-xen502g fe: vgpu-44[30274]: vmiop_log: notice: Loading Plugin0: libnvidia-vgx
Apr 30 10:47:08 coe-xen502g xapi: [ info|coe-xen502g|167|xapi events D:5978e6da5b7b|xenops] xenops: VM.import_metadata {"vm": {"id": "1fc51a16-4d5c-b0bb-416e-97afafdb5782", "name": "cee-cad-appsrv2", "ssidref": 0, "xsdata": {"vm-data": ""}, "platformdata": {"generation-id": "", "timeoffset": "-14401", "usb": "true", "usb_tablet": "true", "vgpu_pci_id": "0000:07:00.0", "vgpu_config": "/usr/share/nvidia/vgx/grid_k260q.conf", "nx": "true", "acpi": "1", "apic": "true", "pae": "true", "viridian": "true", "device_id": "0002"}, "bios_strings": {"bios-vendor": "Xen", "bios-version": "", "system-manufacturer": "Xen", "system-product-name": "HVM domU", "system-version": "", "system-serial-number": "", "hp-rombios": "", "oem-1": "Xen", "oem-2": "MS_VM_CERT/SHA1/bdbeb6e0a816d43fa6d3fe8aaef04c2bad9d3e3d"}, "ty": ["HVM", {"hap": true, "shadow_multiplier": 1.000000, "timeoffset": "-14401", "video_mib": 16, "video": "Vgpu", "acpi": true, "serial": "pty", "keymap": "en-us", "pci_emulations": , "pci_passthrough": false, "b
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000045vgx+0x7b) [0xb24c4b2b]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000480vgx+0x622) [0xb24bfb02]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000799vgx+0x55) [0xb24bab05]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000348vgx+0x38b) [0xb24c7f5b]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so [0xb24aca4a]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000053vgx+0x8d) [0xb24c488d]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000487vgx+0x160) [0xb24bc1a0]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000799vgx+0x55) [0xb24bab05]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000348vgx+0x38b) [0xb24c7f5b]
Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so [0xb24aca4a]
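
For anyone comparing this excerpt against their own host logs, here is a minimal sketch of how the vmiop_log backtrace frames could be tallied per domain. It assumes the line layout seen in the excerpt above (a vgpu-<domid>[<pid>] tag followed by "vmiop_log: error:", a library path, an optional symbol and an address). The names FRAME and summarize_frames are made up for illustration and are not part of any NVIDIA or XenServer tooling.

```python
import re
from collections import Counter

# Assumed frame layout, taken from the excerpt in this post:
#   ... fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000045vgx+0x7b) [0xb24c4b2b]
# The "(symbol+offset)" part is optional; some frames carry only an address.
FRAME = re.compile(
    r"vgpu-(?P<domid>\d+)\[(?P<pid>\d+)\]: vmiop_log: error: "
    r"(?P<lib>\S+?)(?:\((?P<symbol>\w+)\+0x[0-9a-f]+\))? "
    r"\[(?P<addr>0x[0-9a-f]+)\]"
)


def summarize_frames(lines):
    """Count backtrace frames per (domain id, symbol or raw address)."""
    counts = Counter()
    for line in lines:
        match = FRAME.search(line)
        if match is None:
            continue  # not a backtrace frame (e.g. a "notice" line)
        key = match.group("symbol") or match.group("addr")
        counts[(int(match.group("domid")), key)] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the excerpt above; on a real host, read the
    # syslog file where these messages end up instead.
    sample = [
        "Apr 30 10:47:08 coe-xen502g fe: vgpu-44[30274]: vmiop_log: notice: Loading Plugin0: libnvidia-vgx",
        "Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so(_nv000799vgx+0x55) [0xb24bab05]",
        "Apr 30 10:47:35 coe-xen502g fe: vgpu-44[30274]: vmiop_log: error: /usr/lib/libnvidia-vgx.so [0xb24aca4a]",
    ]
    for (domid, frame), n in sorted(summarize_frames(sample).items()):
        print(f"domain {domid}: {frame} x{n}")
```

Grouping by domain ID makes it easier to check whether the backtrace only ever appears for the second K260Q VM, and whether the same frames repeat across both servers.
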
xen501g_nvidia-bug-report.log.gz (1.27 MB)
xen502g_nvidia-bug-report.log.gz (635 KB)
GPU_overview.PNG