How to Resolve GPU Memory Mapping Issues in XenServer
CTX139834 Created on Mar 26, 2014 Updated on Apr 02, 2014
Article Topic: Storage, Other
Objective
This article is for customers running XenServer 6.2.0 who are using the 3D Graphics Pack (3DGP) with NVIDIA GRID GPUs and have problems starting Virtual Machines (VMs) that have a virtual GPU (vGPU) assigned. Such VMs may fail to start with a message similar to the following:
Unix.Unix_error(20, "open", "/sys/bus/pci/drivers/nvidia/bind")
This can be caused by the NVIDIA driver not loading in the host’s control domain. To check this, run the following command on the host console:
lsmod | grep ^nvidia
This will return no results if the driver is not loaded.
To find out whether this is caused by the memory mapping issue, run the following command on the host console:
dmesg | grep NVIDIA
Check for messages containing:
"This PCI I/O region assigned to your NVIDIA device is invalid"
If you see this message, it confirms that the GPU has been mapped into memory inaccessible to the host’s control domain. This can be resolved with a change to the BIOS settings.
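The two checks above can be combined into a small helper. This is a minimal sketch, not part of the original article: the function names (nvidia_driver_loaded, mmio_mapping_issue, diagnose) and the result labels are hypothetical, and the functions take the lsmod and dmesg output as arguments so the logic can be tried away from the host.

```shell
#!/bin/sh
# Sketch: classify the host state from lsmod and dmesg output.
# All function names and result labels here are illustrative assumptions.

# The dmesg text quoted in this article.
INVALID_REGION_MSG="This PCI I/O region assigned to your NVIDIA device is invalid"

# Succeeds if the lsmod output in $1 has a line starting with "nvidia".
nvidia_driver_loaded() {
    printf '%s\n' "$1" | grep -q '^nvidia'
}

# Succeeds if the dmesg output in $1 contains the invalid-region message.
mmio_mapping_issue() {
    printf '%s\n' "$1" | grep -qF "$INVALID_REGION_MSG"
}

# Prints one of: driver-loaded, mmio-mapping-issue, driver-missing-other-cause.
# $1 = lsmod output, $2 = dmesg output.
diagnose() {
    if nvidia_driver_loaded "$1"; then
        echo "driver-loaded"
    elif mmio_mapping_issue "$2"; then
        echo "mmio-mapping-issue"
    else
        echo "driver-missing-other-cause"
    fi
}

# Possible use on the host console:
#   diagnose "$(lsmod)" "$(dmesg)"
```

A result of mmio-mapping-issue corresponds to the case this article addresses. Any other result for a missing driver points to a different cause that the BIOS change below may not fix.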
Instructions
The following sample procedure is for a Dell R720 server. For other server types, refer to the vendor documentation.
1. Reboot the server and enter System Setup (press F2).
2. Navigate to System BIOS, and then Integrated Devices.
3. Change Memory Mapped I/O above 4GB to Disabled.
4. Save the settings and reboot the host.
It should now be possible to start VMs with vGPUs.
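After the reboot, one way to check where the firmware actually placed the GPU's memory regions is to look at the addresses reported by lspci. The following is a rough sketch under stated assumptions: the function name bars_above_4g is hypothetical, it assumes the "Memory at <hex address>" lines printed by lspci -v, and it assumes that a region above 4 GiB is what the control domain cannot reach. An address is treated as above 4 GiB when it needs more than 8 hexadecimal digits.

```shell
#!/bin/sh
# Sketch: list memory region addresses above 4 GiB from lspci -v style text.
# bars_above_4g is an illustrative name, not a XenServer or NVIDIA tool.

# Reads lspci -v style text on stdin and prints each "Memory at" address
# that needs more than 8 hex digits (that is, lies above 0xFFFFFFFF).
bars_above_4g() {
    sed -n 's/.*Memory at \([0-9a-fA-F]\{1,\}\) .*/\1/p' |
    while read -r addr; do
        # Drop leading zeros so padded 32-bit addresses are not miscounted.
        stripped=$(printf '%s' "$addr" | sed 's/^0*//')
        if [ "${#stripped}" -gt 8 ]; then
            echo "$addr"
        fi
    done
}

# Possible use on the host console (10de is NVIDIA's PCI vendor ID):
#   lspci -v -d 10de: | bars_above_4g
```

If this prints nothing for the NVIDIA devices, their regions are all below 4 GiB. Any printed address suggests the BIOS setting has not taken effect or is named differently on that server model.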
Sorry for the extended delay. The environment was migrated to a different datacenter, and other projects took priority.
Memory Mapped I/O is and has always been disabled.
A bit of success: we updated the firmware on the hardware (C240M3) to the latest version from Cisco. Now 1 of the 2 servers will boot vGPU :), but the other still gives the same errors for some reason.
We're going to re-flash the firmware and open both servers to be sure the hardware config is identical between them. Will let you know if we get it working.