How Can You Monitor Memory Usage on a XenServer Running XenApp in GPU Passthrough Mode?

One nice feature of vGPU on XenServer is that you can see with nvidia-smi how much memory is allocated and in use when a VM is running. Is there a way to monitor how much memory is really in use by a VM, such as a XenApp server, that is running on a XenServer with GPU passthrough mode enabled, or is that redundant because it will always be all of it? The nvidia-smi utility doesn’t even show that an association exists, and xe pgpu-list and pgpu-param-list do not help. (*)
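For reference, these are the dom0-side commands I have been trying (the pgpu UUID is just a placeholder); none of them give back any memory figures for the passed-through GPU:

nvidia-smi
xe pgpu-list
xe pgpu-param-list uuid=<pgpu UUID>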

Also, if you run "xe vm-param-get uuid=(UUID of the VM) param-name=other-config"
you get back both a vgpu_pci and a plain pci key. What exactly is the difference between these? They apparently match; is it bad/dangerous if they do not?
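For what it's worth, the two keys can also be read out individually with param-key (assuming I have the map-parameter syntax right):

xe vm-param-get uuid=(UUID of the VM) param-name=other-config param-key=vgpu_pci
xe vm-param-get uuid=(UUID of the VM) param-name=other-config param-key=pci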

Thank you in advance for any enlightenment.

*) There are certain reasons behind wanting to know how much memory is really allocated that have to do with pushing the envelope. :-)

Tobias, if I misunderstand then my apologies, but why not run nvidia-smi in the XenApp guest? The driver is there, so nvidia-smi is as well, and you can also run third-party tools. You are pinning the physical GPU to that guest, so all of its resources are handed to it; I'm not sure whether dom0 taxes it a little or not. It would be interesting to push it and see whether you can get all 4 GB of frame buffer or whether something is slicing a little off the top.

OK, so there is a Windows binary of nvidia-smi that gets installed along with the driver on the Windows VM? If that’s the case, I was not aware of that, so I will give it a go. Do you have the full path to it handy, by chance? I looked and did not find it anywhere.

Many thanks, Luke!

Tobias,

The path where the nvidia-smi.exe command can be found on an x64 or x86 version of Windows should be:

C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe
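From a command prompt in that directory (or with it on the PATH), something like the following should dump just the memory figures; the query fields are standard nvidia-smi options, though I have not verified them against that particular driver build:

nvidia-smi --query-gpu=name,memory.total,memory.used,memory.free --format=csv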

Tobias,
Regarding your question about monitoring memory usage of a pass-through GPU, there are two options that I am aware of.

  1. nvidia-smi within the pass-through VM, not in dom0.
  2. NVWMI performance counters within the VM, via perfmon or another performance-counter consumer (see the example after this list).
  • This will allow you to create a collector set and alerts for when your remaining memory reaches a certain limit, GPU utilization hits 100%, etc.
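As a rough sketch of option 2, NVWMI publishes its data through WMI, so you can poll it from PowerShell inside the VM. The namespace, class, and property names below are from memory and may differ in your NVWMI release, so treat them as placeholders and check the NVWMI documentation:

# Assumed NVWMI namespace/class/properties - verify against your NVWMI install.
Get-WmiObject -Namespace "root\CIMV2\NV" -Class Gpu | Select-Object name, percentGpuUsage, percentGpuMemoryUsage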

Thank you, Jeremy and Luke.

  1. On the dom0 XenServer side, nvidia-smi does not report anything back for any GPU passthrough instances, so you are saying that this same utility will indeed report the metrics if run on the XenApp server itself. That does appear to be the case. Check the attachment to see if the report it generated looks right (I have two passthrough instances applied, and the other two K1 engines are of course not accessible by the VM, as they are allocated to vGPU instances on the XenServer).

  2. I will have to take a closer look at the counters within the VM. (The performance graphs that show up in XenCenter are less than ideal.) Seeing what is going on directly on the VM that is using the GPU in passthrough mode would definitely be preferable. There should indeed be some load figures coming out of nvidia-smi as well; see the example below.
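For example, something along these lines (standard nvidia-smi query options, redirected to a log file name of my own choosing) should capture load and memory figures every few seconds once I run it on the XenApp VM:

nvidia-smi -l 5 --query-gpu=timestamp,utilization.gpu,memory.used,memory.free --format=csv >> gpu_usage.csv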

Thanks very much again for your advice!

Update: Indeed, the embedded nvidia-smi.exe utility in the VM appears to work as you said. Nice!
XenApp_nvidia-smi.JPG