Hi there,
This is a very specific question, but I think this is the only place where it has a chance of getting an answer :-)
On an ESXi host, we want to use 4 RTX Pro 6000 Blackwell GPUs (mainly for inference).
The host has 2 NUMA nodes.
We want to use vGPU for better flexibility, but for now we are using only “full” vGPU instances (vGPU frame buffer size = physical GPU frame buffer size).
Which of these topologies would be the most efficient for this scenario?
- Connect all 4 physical GPUs to the same NUMA node
- Split the physical GPUs evenly across both NUMA nodes (2 + 2)
As far as we understand, P2P functionality is not available (at least not via PCIe).
Is UVM performance significantly affected by NUMA placement?
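In case it helps frame the question, here is a minimal sketch of how we would check, from inside a Linux guest, which NUMA node each GPU reports. It assumes the guest exposes the standard `vendor`, `class` and `numa_node` files under `/sys/bus/pci/devices`. We have not verified whether vGPU devices report a meaningful value there (a value of -1 means the kernel has no affinity information). `gpu_numa_nodes` is just our own helper name, not part of any NVIDIA or VMware tool.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor ID assigned to NVIDIA


def gpu_numa_nodes(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: numa_node} for NVIDIA display-class PCI devices.

    A numa_node of -1 means the kernel reports no NUMA affinity for the device.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
            pci_class = (dev / "class").read_text().strip()
            node = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            # Skip entries that lack these files or hold unexpected content.
            continue
        # PCI class 0x03xxxx is "display controller" (VGA 0x0300, 3D 0x0302).
        if vendor == NVIDIA_VENDOR_ID and pci_class.startswith("0x03"):
            result[dev.name] = node
    return result
```

Calling `gpu_numa_nodes()` in the guest and comparing the result with the vCPU/vNUMA layout of the VM would at least tell us whether a GPU and the cores driving it sit on the same node.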
Best