CUDA on Dell Server with Virtualization

Well, well, well… The link below tells it all: http://en.wikipedia.org/wiki/Hypervisor

What we have been talking about is a Type-2 hypervisor. We should start thinking about adding support for a Type-1 hypervisor like “VMware ESX Server”, which is catered to the enterprise market anyway.

Maybe they support it seamlessly. We just need to investigate whether we can run VMware ESX Server or Xen on top of a GPU box, allocate each GPU to a guest OS, and be done with it…

I would appreciate an answer from the NVIDIA gurus…

You don’t need to run on bare metal. You’d only need to replace the OS’s PCI bus driver. (I.e., the driver for the PCI bus itself, which could hide devices from the host OS and give them to guests.)

That would mean giving the emulator itself an elevated privilege level, plus an API between the emulator and the host driver to handle interrupts and DMA, which goes totally against the conventional design of an OS.

It is better to stick with Type-1 hypervisors like Xen or VMware ESX. I think CUDA would already work on top of these hypervisors. I hope someone from NVIDIA checks this out and updates us. Type-1 hypervisors are already enterprise level, and that would only further help the cause of CUDA.

No, they do not. They present the guest OSes with a certain type of hardware that is in general not the same as the underlying hardware. I will try to check today (I am in the process of deploying an ESX server at this time), but, for example, I can choose between two different SCSI controllers to make available to the guest OS. Those controllers are not in the box itself.
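A quick way to see what hardware a Linux guest is actually being shown is to look for NVIDIA's PCI vendor ID (0x10de) among the PCI devices in sysfs. This is only a sketch: it assumes a Linux guest with the usual `/sys/bus/pci/devices` layout, and the function name is made up for illustration.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = 0x10DE  # PCI vendor ID assigned to NVIDIA


def find_nvidia_pci_devices(sysfs_pci_root="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses whose vendor ID is NVIDIA's.

    Returns an empty list when the directory is missing (e.g. not Linux).
    """
    root = Path(sysfs_pci_root)
    if not root.is_dir():
        return []
    found = []
    for dev in sorted(root.iterdir()):
        try:
            # The "vendor" file holds a hex string such as "0x10de".
            vendor = int((dev / "vendor").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # unreadable or malformed entry: skip it
        if vendor == NVIDIA_VENDOR_ID:
            found.append(dev.name)
    return found


if __name__ == "__main__":
    devices = find_nvidia_pci_devices()
    print(devices if devices else "no NVIDIA PCI device visible to this OS")
```

If the hypervisor only presents emulated hardware, a check like this would be expected to come back empty inside the guest even though the host box has a GPU.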

I THINK ESX also has a facility for creating virtual devices backed by a physical device and then presenting them to the guest OS. This helps in sharing an existing hard disk among guest OSes. (Multiple partitions on the hard disk are seen as individual SCSI disks by the guest OSes.)

But ESX MAY also have an option of presenting physical hardware directly to guest OSes!

At least, I have seen and configured such things on IBM servers. I hope Xen and ESX are close enough.

Do keep us posted. It would be good to know how ESX works. Thanks.

PS:

  1. Note that presenting physical hardware directly to a guest OS (still master-controlled by the hypervisor) raises a security concern. The guest OS could program a bad DMA address and affect devices assigned to other guest OSes. One actually needs hardware support to detect and stop such bad behaviour. There are intelligent PCI bridges that can take care of all this, but I am not sure if ESX runs on such platforms (well, it does… read below).

  2. Wiki on x86 virtualization

    Check out the heading “Virtualization Technology for Directed I/O”. It covers exactly what we are discussing here. VT-d is the technology we are looking for!

  3. Intel VT-x and the similar technology from AMD provide hardware support for CPU virtualization. Make sure you run VMware ESX on such cores. The following Intel cores have VT-x support (courtesy: Wikipedia):

  • Pentium 4 662 and 672
  • Pentium Extreme Edition 955 and 965 (not Pentium 4 Extreme Edition with HT)
  • Pentium D 920-960 except 925, 935, 945
  • Core Solo U1000 series (not T1000 series)
  • Core Duo T2300, T2400, T2500, T2600, T2700 only, plus L2000 and U2000 series
  • Core 2 Solo (all versions)
  • Core 2 Duo all except E8190, E7xxx, E4xxx, T5200-T5550
  • Core 2 Quad all except Q8200
  • Core 2 Extreme Duo and Quad (all versions)
  • Xeon 3000 series
  • Xeon 5000 series
  • Xeon 7000 series

Run ESX on hardware that has VT-x coupled with VT-d. That should help present the CUDA device directly to any guest OS.

  4. VMDirectPath is our answer. It allows guest OSes to touch hardware directly using VT-d. Intel recently did a demo with VMware showing a guest OS accessing a physical NIC directly: Intel VMDirectPath Demo @ IDF

I hope this technology will just work with any hardware, not just NICs. It would be good if someone could try it with CUDA :-)

  5. Well, Intel is going to support VMDirectPath with their Nehalem architecture! This is cool! But the VMDirectPath material keeps talking only about NICs… Hmm… It should actually be applicable to any hardware. If that happens, CUDA would get onto virtual machines with no extra effort from NVIDIA. Check out this Intel presentation… Slides 27, 28 and 29 are about VMDirectPath. They talk about networking and storage. If NVIDIA is interested in getting on top of VMs, they need to talk to VMware right now about including support for GPUs in VMDirectPath!

Actually, VMDirectPath itself means that the VMM need not provide virtual drivers for physical hardware. Hopefully, it will encompass a broad spectrum of hardware.

Is there anyone from NVIDIA who would like to comment on this?

  6. Xen supports VT-d too. There have been discussions about supporting PCI-E graphics cards via VT-d, and bugs have been filed with Xen about this. But my company’s internet filter is blocking Xen’s home page. Waiting for that ban to be lifted… Maybe we can get CUDA working with Xen.
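For the CPU side of the PS above, a Linux host lists `vmx` (Intel VT-x) or `svm` (AMD-V) among the flags in `/proc/cpuinfo`. Here is a small sketch of that check. It assumes a Linux host and the function name is just for illustration. Note that it says nothing about VT-d, which is a chipset/firmware feature and does not show up in the CPU flags.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in cpuinfo text.

    'vmx' marks Intel VT-x and 'svm' marks AMD-V in Linux's /proc/cpuinfo.
    """
    wanted = {"vmx", "svm"}
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the "flags" lines, one per logical CPU.
        if sep and key.strip() == "flags":
            found |= wanted & set(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = virtualization_flags(f.read())
    except OSError:
        flags = set()  # not Linux, or the file is unreadable
    print("CPU virtualization flags:", sorted(flags) or "none found")
```

Even when the flag is present, the feature may still be switched off in the BIOS, so treat the result as a hint rather than a guarantee.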

Thanks for doing the sleuthing. VT-d seems to be exactly what we were talking about:

So yeah, NVIDIA should contact VMware and make sure their GPUs will work on VMDirectPath. There should be nothing stopping them, but you know how corner cases are.

I still have a hunch that all this can be done without the VT-d hardware support even if it’s a stability/security risk, but whatever. I’m glad someone’s working on it.
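To make the stability/security risk concrete, here is a toy model of what DMA remapping hardware such as VT-d adds: every device gets its own translation table, and a DMA to an address its guest was never given is refused instead of landing in someone else's memory. Everything in it (class names, the 4 KiB page size, the device labels) is hypothetical and only illustrates the idea, not how any real IOMMU is programmed.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class DmaFault(Exception):
    """Raised when a device tries to DMA outside what its guest was given."""


class ToyIommu:
    """Hypothetical model: a per-device table of guest page -> host page."""

    def __init__(self):
        self._tables = {}

    def map_page(self, device, guest_page, host_page):
        """Allow `device` to reach `host_page` through `guest_page`."""
        self._tables.setdefault(device, {})[guest_page] = host_page

    def translate(self, device, guest_addr):
        """Translate a device-issued address, or refuse it with DmaFault."""
        page, offset = divmod(guest_addr, PAGE_SIZE)
        table = self._tables.get(device, {})
        if page not in table:
            raise DmaFault(f"{device}: DMA to unmapped address {guest_addr:#x}")
        return table[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    iommu = ToyIommu()
    iommu.map_page("gpu0", guest_page=0, host_page=512)
    print(hex(iommu.translate("gpu0", 0x10)))  # inside the mapped page
    try:
        iommu.translate("gpu0", 0x5000)  # a page the guest was never given
    except DmaFault as err:
        print("blocked:", err)
```

Without the remapping step, the address a guest programs into the device would be used as-is, which is exactly the bad-DMA scenario raised earlier in the thread.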

EDIT: looks like AMD has something similar:

What I’m confused about is which chips/chipsets support this stuff. Are they out already? I think they are; I don’t think this is a Nehalem technology.

Thanks. Not just VMware, Xen might also work. VT-d support is there in Xen. Not sure about graphics cards… But it is just a question of enabling support one device at a time… So they probably already have support. Someone just needs to check.

They are already available. I see people who have VT-d hardware raising questions on the Xen forums. We just need to do some extra investigation… If we find something, let us post it here.

Denis, have you got any news for us from the VMware ESX stuff that you were doing?

Best Regards,

Sarnath

Check this thread on the Intel forum

That discussion will help us find out which Intel chipsets support VT-d.

The following post is especially useful.

Here is a cut and paste:

Here is the list of production platforms that has VT-d on it.

Stoakley (Seaburg) chipset based platforms that has VT-d1 on it:

HP XW8600:

http://h10010.www1.hp.com/wwpc/us/en/sm/WF25a/12454-12454-296719-307907-296721-3432827.html

Dell T5400

http://www.dell.com/content/products/produ…~tab=bundlestab

SuperMicro based on 5400:

Super Micro Computer, Inc. - Products | Motherboards | Xeon Boards | X7DWN+

Weybridge platforms:

For Weybridge, VT-d is productized by Intel only on vPro branded client platforms. Below is a list of such platforms, supporting VT-d, offered by HP, Dell, Acer, Intel board, Lenovo etc. I don’t believe SuperMicro carries vPro branded boxes, so we don’t know if the SuperMicro BIOS enables VT-d properly on their Weybridge offerings.

  • HP Compaq DC7800 series of desktop computers

http://h10010.www1.hp.com/wwpc/us/en/en/WF04a/12454-12454-64287-321860-3328898.html

  • Dell Optiplex 755 desktop computers

http://www.dell.com/content/products/categ…=555&l=en&s=biz

  • ACER vPro system Veriton T661/M661/S661

Acer United States | Laptops, Desktops, Chromebooks, Monitors & Projectors

  • Intel board with Intel® Q35 chipset (supporting Intel® VT-d)

http://www.intel.com/products/desktop/moth…=desk_nav+board

  • Lenovo ThinkCentre M57p series desktop computers

http://shop.lenovo.com/SEUILibrary/control...A6F7ThinkCentre

Kindly go through the discussion. We might find more. There is info about future chipsets as well.

BEWARE that these are not official statements. These are statements from Intel employees… So we might need to do some mucking around to find what exactly is supported…

There is also talk about Intel VT-d2 technology, and I see that Xen 3.2 started supporting VT-d (early 2008).

Hmm, been really busy installing and migrating servers ;) As of today I have access to the virtual center, which has some extra options, so I’ll check there as well. I saw some options regarding hardware support for virtualization, but did not really notice VMDirectPath. Then again, the policy at my work is to use version N-1 unless N is needed for a bugfix, so it might be that the version we are currently using does not support this stuff yet.

I’ll also try to ask our VMware expert, because the machine I have is overkill for what we need. If there were CUDA support I would attach an S1070 to it for sure, as I will have around 4 cores and 16 GB of memory left over…

Hmm, the chipset does not support VT-d, so I’ll need to run a bare-metal server. Ah well, might as well fill the rack up a bit more ;)
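For anyone checking their own box: on a recent Linux host (an assumption, since this thread is about ESX and Xen), the kernel exposes IOMMU groups under `/sys/kernel/iommu_groups` once VT-d or AMD's equivalent is present and enabled. A rough sketch that lists them, with a made-up function name:

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map IOMMU group number -> sorted PCI addresses in that group.

    An empty dict means no groups are exposed: the IOMMU is absent,
    disabled in firmware or kernel, or the path does not exist here.
    """
    base = Path(root)
    groups = {}
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                p.name for p in devices_dir.iterdir()
            )
    return groups


if __name__ == "__main__":
    groups = list_iommu_groups()
    if not groups:
        print("no IOMMU groups found")
    for number in sorted(groups):
        print(number, groups[number])
```

An empty result does not prove the chipset lacks VT-d; it may just be switched off in the BIOS or on the kernel command line.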

Thanks for the update. VMDirectPath will work only with VT-d hardware, and it looks like they are targeting network and storage devices. That is most likely the answer you would get from your VMware expert.

Xen on VT-d with graphics cards looks appealing. If someone tries that out, do post your results here. Thanks.

Thanks to all you guys for your time! It was very interesting to discuss CUDA and virtualization. I hope we will soon hear some success stories along these lines…

Best Regards,

Sarnath

I just read the TechRepublic PDF on Hyper-V. It talks about synthetic devices, where one can assign devices directly to guest OSes… If that works for GPUs or the Tesla, we should get CUDA going on virtualization platforms…

Has anyone got this installed? (Note: Hyper-V is only for 64-bit platforms…)

TechRepublic URL:
http://ct.techrepublic.com.com/clicks?t=72...EPUBLIC&s=5

Any updates from anyone?
Has anyone been able to run CUDA in a virtual environment?
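For anyone who wants to report back, a simple test from inside a guest is to ask the CUDA driver how many devices it sees. The sketch below loads the driver library through `ctypes` and calls the driver API functions `cuInit` and `cuDeviceGetCount`. It assumes a Linux guest with the NVIDIA driver installed (the library names are the usual Linux ones), and the wrapper function is only for illustration.

```python
import ctypes


def cuda_device_count(library_names=("libcuda.so.1", "libcuda.so")):
    """Return the number of CUDA devices the driver reports, or None.

    None means the driver library could not be loaded or a driver call
    failed, which is what one would expect in a guest that has no GPU
    passed through to it.
    """
    for name in library_names:
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # library not present under this name
        if lib.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
            return None
        count = ctypes.c_int(0)
        if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
            return None
        return count.value
    return None


if __name__ == "__main__":
    n = cuda_device_count()
    print("CUDA driver not usable here" if n is None else f"{n} CUDA device(s)")
```

A non-zero count inside the guest would be the kind of success story this thread is waiting for.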