We have a setup on one workstation with two virtual machines. Each VM has its own GPU, and together they form a processing/visualization pipeline: results processed in the Linux VM are visualized in the Windows VM.
Communication between the two VMs goes through the hypervisor (virtual network). This means the data is copied from the first GPU into the Linux VM's memory, then across to the Windows VM, and finally onto the second GPU. This is CPU intensive, and we would like to explore GPUDirect for it.
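For context, the current path looks roughly like the sketch below (the function and buffer names and the socket plumbing are placeholders, not our actual code): the Linux side copies results from device to host memory and pushes them over the virtual network, and the Windows side does the reverse.

```cpp
// Sketch of the current CPU-staged path (names are placeholders).
// Linux VM: pull results off the GPU, then send them over the
// hypervisor's virtual network to the Windows VM.
#include <cuda_runtime.h>
#include <sys/socket.h>
#include <vector>

void send_frame(const float* d_result, size_t bytes, int sock_fd) {
    // 1) GPU -> host copy (consumes CPU and PCIe bandwidth in the Linux VM)
    std::vector<char> h_staging(bytes);
    cudaMemcpy(h_staging.data(), d_result, bytes, cudaMemcpyDeviceToHost);

    // 2) Host -> virtual NIC -> hypervisor -> Windows VM (more CPU copies)
    send(sock_fd, h_staging.data(), bytes, 0);

    // The Windows side then does recv() plus cudaMemcpyHostToDevice,
    // so the data crosses CPU memory twice before reaching the second GPU.
}
```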
The question is:
Would GPUDirect work in this PCIe passthrough setup? Something like having only the network interfaces and a cable doing a loopback to the same machine?
With this we want to offload some of the load from the CPU/hypervisor.
I am curious why you need two systems for one task. Is it possible to migrate the Linux application to Windows, since CUDA and TensorRT are supported on Windows?
Well, the hardware itself can be upgraded. We are currently using an RTX 4000 in Linux and a GTX 1070 in Windows. But we are more interested in the general question of whether a GPUDirect link can be established between two GPUs in the same system, to avoid overloading the CPU with copying the data.
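For what it's worth, when both GPUs are visible to the same OS instance (bare metal, or both passed through to a single VM), CUDA exposes peer-to-peer copies that go over PCIe (or NVLink, where available) without staging through host memory. A minimal sketch, assuming devices 0 and 1 are the two GPUs; whether the P2P check passes depends on the specific GPUs and topology, and this does not by itself span two separate VMs:

```cpp
// Minimal CUDA peer-to-peer sketch: only applies when both GPUs are
// visible to the same OS instance, not across two separate VMs.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);  // can GPU 0 access GPU 1?
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);  // and the reverse?
    if (!canAccess01 || !canAccess10) {
        std::printf("P2P not supported between these two GPUs\n");
        return 1;
    }

    size_t bytes = 64 << 20;  // 64 MiB test buffer
    void *d_src = nullptr, *d_dst = nullptr;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);   // allow GPU 0 to access GPU 1
    cudaMalloc(&d_src, bytes);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);   // allow GPU 1 to access GPU 0
    cudaMalloc(&d_dst, bytes);

    // Direct GPU 0 -> GPU 1 copy over PCIe/NVLink, no host staging buffer.
    cudaMemcpyPeer(d_dst, 1, d_src, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(d_dst);
    cudaSetDevice(0);
    cudaFree(d_src);
    return 0;
}
```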
To confirm: if I am running two VMs, both running Linux, each with its own independent GPU assigned via PCIe passthrough, should I be able to initiate GPUDirect RDMA transfers over PCIe without hitting shared CPU memory?
As a follow-up: would this work with two Windows VMs? And would GPUDirect between the two VMs over a physical NVLink be supported on either Linux or Windows, to accelerate transfers?
I’ve wondered about this for a long time, hopefully you’ve run across some use cases like this.