Multi GPUs using MPI in Unified Memory

Hi,

Multi-GPU Unified Memory and Communication.

From this thread, I was told that in a multi-GPU system with Unified Memory, MPI is used for communication/data transfer between the GPUs.

However, if the memory is unified, doesn't that mean all the GPUs share the same address space?
Why is MPI needed if all the processors share one address space? Can't the memory locations simply be accessed with loads and stores?

Is there any other way a multi-GPU system can be programmed apart from using MPI?

While the GPUs do share a Unified Memory space, each MPI process has its own address space that is not shared between processes. For multi-process programs there is NVSHMEM, which can provide a unified view of GPU memory across processes, but that is a separate programming model you would need to add; it is not part of MPI.
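For example, here is a minimal sketch (assuming a CUDA-aware MPI build and at least one GPU per rank; the buffer size and the simple round-robin device binding are just illustrative) of how two ranks might exchange data that lives in managed (Unified) memory:

```c
// Minimal sketch: two MPI ranks exchanging a managed-memory buffer.
// Assumes a CUDA-aware MPI implementation and one GPU available per rank.
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    // Bind each rank to its own GPU (simple round-robin; a real code
    // would typically use the node-local rank instead).
    int ndevices = 0;
    cudaGetDeviceCount(&ndevices);
    cudaSetDevice(rank % ndevices);

    const size_t n = 1 << 20;
    float *buf = NULL;
    cudaMallocManaged(&buf, n * sizeof(float));   // Unified Memory allocation

    if (rank == 0) {
        for (size_t i = 0; i < n; ++i) buf[i] = 1.0f;
        MPI_Send(buf, (int)n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, (int)n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received buf[0] = %f\n", buf[0]);
    }

    cudaFree(buf);
    MPI_Finalize();
    return 0;
}
```

Even though each rank sees a single unified address space within its own process, the two processes still have separate address spaces, which is why the data has to go through MPI (or NVSHMEM) rather than plain loads and stores.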

There's also OpenMP, which uses a single process with multiple threads and can drive multiple GPUs from a single unified address space. The biggest challenge with OpenMP target offload is keeping the discrete device memories coherent and in sync with the host, but that issue is moot with Unified Memory. You still need to use the “device” clause so the computation goes to the correct device, but it could be a viable option for you. A rough sketch of this single-process model is below.
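This sketch assumes a compiler/runtime with OpenMP 5.x target offload and unified shared memory support (e.g. nvc); the array size and the doubling kernel are just placeholders. It spreads one loop across all visible GPUs from a single process, using the device() clause to pick the GPU:

```c
// Sketch: one process, multiple GPUs via OpenMP target offload.
// Assumes the compiler/runtime supports OpenMP 5.x unified shared memory.
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#pragma omp requires unified_shared_memory

int main(void)
{
    const int n = 1 << 24;
    double *x = malloc(n * sizeof(double));
    for (int i = 0; i < n; ++i) x[i] = 1.0;

    int ndev = omp_get_num_devices();
    if (ndev == 0) ndev = 1;              // fall back to the host
    int chunk = n / ndev;

    // One host thread per GPU; each offloads its slice to a different device.
    #pragma omp parallel for num_threads(ndev)
    for (int d = 0; d < ndev; ++d) {
        int lo = d * chunk;
        int hi = (d == ndev - 1) ? n : lo + chunk;

        // The device() clause routes the computation to the right GPU;
        // with unified shared memory, no map() clauses are needed.
        #pragma omp target teams distribute parallel for device(d)
        for (int i = lo; i < hi; ++i)
            x[i] = 2.0 * x[i];
    }

    printf("x[0] = %f, x[n-1] = %f\n", x[0], x[n - 1]);
    free(x);
    return 0;
}
```

Since everything runs in one process, all the GPUs and the host really do see the same pointers here, which is the situation the original question was picturing.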