MMIO above 4 GB, ESXi 6.0u1, vGPU

I’m using a single GRID K1 card in a PowerEdge R730 that was working great until I upgraded the BIOS and some other firmware. Now I get a warning on boot that says "Unable to allocate MMIO resources for one or more PCIe devices because of insufficient MMIO memory". I can enable the MMIO above 4 GB setting and the error goes away, but VMware ESXi doesn’t support this setting. I’ve left it disabled, hit F1 to bypass the message, and the GRID card still appears (it’s the only PCI device too). My question is, should I be worried about stability, since the warning implies that the card is trying to map memory above the 4 GB limit? With the setting disabled, will the GRID card still attempt to map memory above the 4 GB barrier? My concern is that the GRID card initially uses memory below the 4 GB barrier and could eventually cross it, resulting in a PSOD or some other guest VM or host crash. I’ve had previous stability issues with a GRID K1 and a PowerEdge R720 (long thread here: https://communities.vmware.com/thread/488038) and want to make sure that this setup on 13th gen PowerEdge equipment is stable.
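One way to see where a device's memory regions ended up is to look at the base addresses the firmware assigned at boot. The following is only a minimal sketch, not something taken from Dell, VMware, or NVIDIA documentation: it assumes text in the Linux-style `lspci -v` format ("Memory at <hex> (32-bit|64-bit, ...)"), and the ESXi shell may print these regions differently. The function name `classify_bars` and the sample addresses are made up for illustration.

```python
import re

FOUR_GIB = 1 << 32  # the 4 GB boundary, 0x1_0000_0000

# Assumed line format (Linux-style lspci -v), for example:
#   Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
BAR_RE = re.compile(r"Memory at ([0-9a-fA-F]+) \((32|64)-bit")


def classify_bars(lspci_text):
    """Return (base_address, bar_width_bits, is_above_4gb) for each memory region found."""
    results = []
    for match in BAR_RE.finditer(lspci_text):
        base = int(match.group(1), 16)
        width = int(match.group(2))
        results.append((base, width, base >= FOUR_GIB))
    return results


# Hypothetical sample text, not captured from a real GRID K1.
SAMPLE = """\
Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=128M]
Memory at 38000000000 (64-bit, prefetchable) [size=32M]
"""

if __name__ == "__main__":
    for base, width, above in classify_bars(SAMPLE):
        where = "above" if above else "below"
        print(f"{base:#x} ({width}-bit BAR): {where} 4 GB")
```

In this sample, only the third region sits above the 4 GB boundary. A 64-bit BAR can still be placed below 4 GB, so the BAR width alone does not say where the region was mapped.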

John,
I’m not a current ESXi user, but as I understand it, the limitation was based on the underlying hypervisor not being able to support memory mapping above 4 GB for the XenServer versions that were 32-bit (so prior to XenServer 6.5). If ESXi is based on a 64-bit OS, you should be fine with the >4 GB memory mapping unless stated otherwise. If not, you need to stay below the 4 GB boundary. Did you check the NVIDIA documentation to see what its installation setting recommendations are? Where does it state that ESXi doesn’t support the higher memory mapping setting? You must have a fair number of devices in there to trigger even needing the higher memory mapping – I never saw this with XenServer 6.2 (32-bit), at least.
The bottom line is that as long as it works and is stable, it should be OK, but there should be specific documentation from both VMware and NVIDIA that states what the settings should be. If there is still an issue, you might need to open a support case.

I opened cases a little while back with both Dell and VMware. Dell recommends that the setting be enabled while VMware says the exact opposite and has a KB here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2050443. I believe NVIDIA recommends that the setting be enabled. The strange thing is that I was able to get both PCI-e devices to work with the setting disabled initially. After I upgraded the BIOS, I get the warning message on boot, even if I disable all PCI-e slots except for Slot 6 where the GRID card is. I’m not sure why the error would appear when the hardware configuration has not changed.

Hi, John: The fact that the two vendors contradict each other also makes me think it may be a bug. Any 64-bit OS should be able to support mapping above the 4 GB mark, provided the BIOS isn’t a limiting factor. Are you running the latest BIOS level on your server?

In ESXi you have to have MMIO set to below 4 GB. The VMware article is correct. Although ESXi is a 64-bit hypervisor, it still has this restriction.

If there is only a single GRID card in the host and ESXi is failing to boot when MMIO below 4 GB is set, then there is an issue with the server somewhere. Ensure you’ve got all other settings configured for performance.

Thanks for that clarification, Jason!

We’re using XenServer, but had a similar experience with the R730 boxes. The R720 with 2 K1 cards would boot just fine with the BIOS option set to MMIO <4 GB. The R730 box didn’t like that combination. It wasn’t resolved until Citrix released XS 6.5; then I could check the box to enable MMIO >4 GB and it started just fine. IMO the R730 is just a little bit different than the R720 box.

With XS 6.5 both the R720 and R730 boxes are running great. Note: I’m on XD 7.6.

I finally got a chance to check the power settings (we only have one host and it’s used 24/7). It was already set to Performance. I even tried Custom and made sure that all C-states were disabled and did a cold boot. The error still appears so for now, no real graphics for our VDI users. I opened a case with VMware to see if MMIO above 4 GB is still not supported because the KB that I referenced is no longer accessible for some reason. Any other suggestions are appreciated…