Tesla M10 + ESXi 6.5 + Linux guest (CentOS 7)

Hello there again,
we are testing our Dell R730 + M10 with an ESXi environment. The idea is to have a classroom for about 30 users that could use a Windows RDSH guest and a Linux guest. With a Windows guest everything seems fine with vDGA and vSGA. Now the problem is the Linux guest.
We have followed many guides and tutorials, including posts on this forum, and finally got an X server running with no complaints. But in vGPU/vDGA mode, all we get when connecting remotely to the guest is the "Oh no! Something has gone wrong." sad face. The console is blank, but that is OK, it seems to be a VMware limitation. After applying suggestions from another post at HP: https://h20566.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=7307066&docLocale=en_US&docId=emr_na-c05033719 we managed to connect to the guest, but at the cost of not using OpenGL.
So my questions now:

  • is using a Linux guest with these GRID card(s) meant only for Horizon/Citrix environments? I thought we could connect with a standard protocol too
  • would using vSGA instead of vGPU/vDGA benefit from the M10 features?
  • what if I use a Linux host with the GRID drivers installed?
  • we forgot to try, but would Win2016 + Hyper-V permit passthrough to a Linux guest too (and consequently let us use the GPU features of the M10 from inside Linux)? That could be the winning choice if we cannot afford the VMware licensing costs.

thanks for any help
a.

Hi Alaxa

Right … That’s a lot of varied information and questions

Let’s sort some of this out …

Linux -

Sounds like you’ve followed too many guides, all with differing information, and got yourself in a mess. Let me try and help with that … I’ll send you a PM with a URL to download a really basic text file with step-by-step (copy and paste) instructions of how I do it on my Linux 6.x VMs; for 7 you may need to alter a couple of links. I’ve tried to post the steps on here, but there’s something in the Forums Security that stops me from doing so. No idea what it is. I’ll raise it with the Site Admins when I get time, as it’s not a script, so there shouldn’t be any issues, but the text is restricted for some reason.

Once you receive my steps, for simplicity, I’d trash what CentOS VMs you currently have and start with a clean CentOS install as it doesn’t take long. As said, it’s all step by step (no scripts, so you’re in full control of what’s happening and it’s really obvious and basic anyway) and you can simply copy / paste the commands where required. I’m going to assume you have some basic Linux skills (that’s all I have) so you will understand what to do when I send it.

Once you’ve completed the install, you’ll be able to connect to the Linux VM using standard Windows RDP (MSTSC). If you want to use any other protocol (HDX, Blast etc etc), then you’ll have to install the appropriate client / agents from that vendor.
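Before blaming the desktop session, it can help to confirm that the RDP listener on the VM is reachable at all. The following is a minimal Python sketch, not part of the PM'd steps: it assumes the RDP service on the guest listens on the standard RDP port 3389, and the helper name `port_is_open` is made up for illustration.

```python
import socket


def port_is_open(host, port=3389, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    3389 is the standard RDP port; this only proves that something is
    listening there, not that a desktop session will actually start.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `port_is_open("linux-vm.example")` (a placeholder hostname) from the client side separates "the port is closed or firewalled" from "the port is open but the session fails after login".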

Your questions:

Q - is using a Linux guest with these GRID card(s) meant only for Horizon/Citrix environments? I thought we could connect with a standard protocol too

A - You can use RDP as your connection protocol, but it won’t be anywhere near as good as HDX, Blast or others (there’s a reason people use XenDesktop / Horizon for graphics based virtualization …)

Q - would using vSGA instead of vGPU/vDGA benefit from the M10 features?

A - No. Using vGPU will give you the maximum amount of features and vGPU Profile choice. vDGA (Passthrough) will "pass through" the GPU to the OS. This is a 1-1 mapping. You will still need to use the NVIDIA drivers, not VMware drivers. vSGA is the lowest-performing option you can use (a bit of a waste of time, and of an M10, to be honest).

Q - what if I use a Linux host with the GRID drivers installed?

A - I don’t understand the question. This is what you’re already trying to do? Regardless, the steps in my text document will install the GRID software onto your Linux VM.

Q - we forgot to try, but would Win2016 + Hyper-V permit passthrough to a Linux guest too (and consequently let us use the GPU features of the M10 from inside Linux)? That could be the winning choice if we cannot afford the VMware licensing costs.

A - If you’re struggling with VMware licensing, why not use Citrix XenServer? XenServer is licensed per socket and you don’t need a vCenter equivalent, so it’s much cheaper than the VMware alternative. Or you can purchase XenDesktop licenses, and by doing so you get XenServer licenses included (including the ability to run vGPU); see the NVIDIA article "NVIDIA vGPU for XenServer is premium Citrix-licensed feature". That way, you also get to use one of the best connection protocols available (rather than RDP) as well as get a cheaper (but still very good) hypervisor included in the price. However, for a single server, a XenDesktop install would be huge overkill.

That said, if you’re only using vDGA or vSGA then you don’t need VMware Enterprise Plus licensing or even vCenter, as you can Passthrough PCIe without either. You only need Enterprise Plus licensing if you’re using vGPU, which currently you aren’t; see the NVIDIA article "NVIDIA vGPU for vSPhere/ESXi is a premium VMware-licensed feature".

Personally, I would not use Hyper-V. As said, most people use Citrix (XenServer) or VMware (ESXi). If you have issues, there will be a lot more support available for them.

Hope that helps point you in the right direction

Regards

Ben

PM Sent …

Wow, thank you for the reply and the step-by-step instructions via TXT, both are very exhaustive!
Yes, I know, many questions and many sites from which I got inspiration after failing with the basic NVIDIA guide included with the latest GRID 4.2 drivers.
Going back to your TXT step-by-step, I see that you are suggesting what I’ve almost already done, except for using xrdp, which I would never have thought of, and that you keep the nouveau driver running. Anyhow, I followed everything step by step with CentOS 7.3 and CentOS 6.9 (which I believe your TXT was for), but I don’t get a working graphical environment. Or rather, I get one via xrdp, NX, VNC or SSH+X, but each time I run nvidia-settings it complains about missing NVIDIA drivers. I know they are loaded because Xorg.0.log says so, but ‘glxinfo’ and ‘glxgears’ fail as well.
Apart from your TXT steps, I tried removing nouveau and adding an xorg.conf with the BusID of my NVIDIA vGPU, but I get the same result. Ideas? Does your Linux guest work?
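One detail worth double-checking when writing a BusID into xorg.conf by hand: lspci prints the bus, device and function fields in hexadecimal, while the Xorg BusID option expects them in decimal. Whether that is the problem here is not known from the thread; the Python sketch below only illustrates the conversion. The helper name and the sample address are hypothetical, and the PCI domain field is ignored for simplicity.

```python
import re

# Matches '0b:00.0' or '0000:0b:00.0' as printed by lspci (all fields hex).
PCI_ADDRESS = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def lspci_to_xorg_busid(address):
    """Convert an lspci-style PCI address into an xorg.conf BusID string."""
    match = PCI_ADDRESS.fullmatch(address.strip())
    if match is None:
        raise ValueError("not a PCI address: %r" % address)
    _domain, bus, device, function = match.groups()
    # Xorg wants decimal numbers, so 0b (hex) becomes 11.
    return "PCI:%d:%d:%d" % (int(bus, 16), int(device, 16), int(function, 16))


print(lspci_to_xorg_busid("0b:00.0"))  # prints PCI:11:0:0
```

For addresses below 0a the hex and decimal forms look identical, which is why this mismatch often goes unnoticed until a card lands on a higher bus number.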

Instead, reading further on the net, I discovered that GNOME 3 has some trouble with remote connections, and that might be the reason I got the "Oh no! Something has gone wrong." message in many situations. However, I chose CentOS 6.8 with GNOME 2, given that, despite the filename, the drivers included in GRID 4.2 seem to run flawlessly.

Going to your considerations:

  • we like the idea of having an ESXi with as simple a license as possible that will allow us to run, as needed, a Win2016 RDSH or a Linux guest, both with passthrough or vGPU. We will look at prices; for sure, if we have a working solution with ESXi we would need more than "up to 8 vCPU" per VM, which is one limit of free ESXi
  • Linux host: because we are in a rush and need a powerful Linux environment for the next two weeks, and this ESXi with a Linux guest (VM) seems not to work, our idea was to use bare-metal CentOS with the NVIDIA drivers from GRID 4.2 (trashing nouveau). So the question: would the card work with all its features in that configuration?

No worries, hopefully some of it is helpful. Yes for 7.x you may need to alter some of the download paths as they do relate to 6.x, but that’s easy enough to do.

You only need VMware Enterprise Plus licensing if you’re running vGPU. If you want to run Passthrough, you can do that with "Standard" licensing, still without vCenter, as it’s just PCIe with no fancy software involved.

As you’re in a rush, unless you specifically need a 7.x version, why not just use 6.x as you know it works? Then you won’t have to reinstall and configure the hypervisor when you want to try a different Operating System.

Using an M10 in bare metal will technically work and you can use the GRID drivers. You won’t be able to use all 4 GPUs unless you have some appropriate software, so don’t expect any huge performance boost. But yes, it will technically work.
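On a bare-metal install, a quick way to see how many GPUs the driver actually exposes is to query nvidia-smi. The sketch below is in Python and assumes the NVIDIA driver and the `nvidia-smi` tool are installed and on the PATH; the function names are made up for illustration, and the parsing assumes GPU names contain no commas.

```python
import subprocess

# Ask nvidia-smi for one CSV line per GPU: index, product name, total memory.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_list(csv_text):
    """Turn the CSV text from the query above into a list of dicts."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, name, memory = [field.strip() for field in line.split(",", 2)]
        gpus.append({"index": int(index), "name": name, "memory": memory})
    return gpus


def list_gpus():
    """Run nvidia-smi and return the parsed GPU list (raises if it fails)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_list(result.stdout)
```

On a host where the driver sees the whole M10, `len(list_gpus())` would be expected to report four entries; each application still has to be able to use more than one GPU to benefit from that.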

Regards

Ben

Yes, in the end we will go for CentOS 6.9, which we know better (ahem…) and which comes with GNOME 2 and works with old fancy protocols such as NX/VNC/x2go etc…
BUT, can you tell me if your step-by-step instructions give you a Linux VM with a graphical environment where you can run nvidia-settings? In my tests it complains with a "You do not appear to be using the NVIDIA X driver" message.
thanks,
a.
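One way to narrow down the nvidia-settings complaint is to check which driver the X server in Xorg.0.log actually attached to each screen. A possibility the thread does not confirm is that a remote session (xrdp, VNC, NX) runs on a separate X server from the one that wrote that log, in which case the log can show the NVIDIA driver while the session itself does not use it. The Python sketch below assumes the usual Xorg log layout, where per-screen driver messages look like `(II) NAME(0): ...`; the function name is hypothetical.

```python
import re

# Per-screen driver messages in an Xorg log carry a marker such as (II) or
# (EE), then the driver name and the screen number in parentheses.
SCREEN_LINE = re.compile(r"^\s*(?:\[\s*\d+\.\d+\]\s*)?\(\S\S\)\s+(\w+)\((\d+)\):")


def drivers_per_screen(log_text):
    """Map each screen number to the driver names that logged against it."""
    seen = {}
    for line in log_text.splitlines():
        match = SCREEN_LINE.match(line)
        if match is None:
            continue
        name, screen = match.group(1), int(match.group(2))
        names = seen.setdefault(screen, [])
        if name not in names:
            names.append(name)
    return seen
```

Feeding it the contents of the log (commonly /var/log/Xorg.0.log, though the path is an assumption here) and seeing only `NVIDIA` for screen 0 suggests that X server is fine; comparing that with the DISPLAY value inside the remote session shows whether the session is on the same server.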

I’m yet to try any 7.x Linux VM, and I don’t know the specific differences between a 6.x and 7.x Linux OS as I typically don’t use Linux.

If you’re having problems running through those steps on 7.x, then I would assume they don’t work there as-is. They are perfect for 6.x, but as said, I haven’t personally tried 7.x.

Regards