OpenACC confusion about managed and unified memory

A lot of online resources describe the flag -gpu=managed as managed or unified memory. But there is also a flag -gpu=unified. What is the difference between these flags? Also, does using -stdpar automatically imply -gpu=managed or -gpu=unified?

What is the difference between these flags?

The “-gpu=managed” flag uses CUDA Unified Memory in its managed-memory form, where the CUDA driver implicitly moves data between host and device memory. However, only heap (dynamically allocated) memory can be managed.

With “-gpu=unified”, this support is extended to allow for all memory, including static and stack, to be managed. Full details can be found in a blog that I co-authored: https://developer.nvidia.com/blog/simplifying-gpu-programming-for-hpc-with-the-nvidia-grace-hopper-superchip/
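To illustrate the kind of code this covers, here is a minimal sketch that touches a static array and a stack array inside OpenACC regions. The names `table` and `sum_of_squares` are hypothetical, and the intended build line (`nvc++ -acc -gpu=unified` on a system that supports it) is an assumption.

```cpp
#include <cstddef>

constexpr std::size_t N = 1024;

// Static storage: not heap-allocated, so outside what -gpu=managed covers.
static double table[N];

// Hypothetical example: fill the static table, square it into a stack
// array, then reduce. No data clauses are written for either array.
double sum_of_squares() {
    double partial[N];  // stack storage

    #pragma acc parallel loop
    for (std::size_t i = 0; i < N; ++i) {
        table[i] = static_cast<double>(i);
        partial[i] = table[i] * table[i];
    }

    double total = 0.0;
    #pragma acc parallel loop reduction(+:total)
    for (std::size_t i = 0; i < N; ++i) {
        total += partial[i];
    }
    return total;
}
```

Because both arrays have a known size, a compiler without unified memory could still fall back to implicit copies here. The point of the sketch is only that, with unified memory, the same addresses are expected to be usable directly from the device.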

However, “-gpu=unified” support is only available on systems where Heterogeneous Memory Management (HMM) is enabled in the Linux kernel. For systems without NVLink, such as x86+PCIe, full Unified Memory is functional but may not be performant. As the blog shows, though, our new Grace Hopper systems can match the performance of manual data management for most codes when using unified.

The larger impact is that we’re also able to remove the restrictions on C++ STDPAR code, including support for lambda capture by reference.
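As a small sketch of the capture-by-reference case: the lambda below refers to a local (stack) variable by reference, which is the pattern that managed-only memory cannot support on the device. The function name `scale_in_place` is hypothetical, and building it with `nvc++ -stdpar` on a unified-memory system is an assumption.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical example: scale every element by a factor that lives on the
// stack. The lambda captures `factor` by reference, so the parallel code
// must be able to dereference a stack address.
void scale_in_place(std::vector<double>& v, double factor) {
    std::for_each(std::execution::par_unseq, v.begin(), v.end(),
                  [&factor](double& x) { x *= factor; });
}
```

With managed memory only, the usual workaround is to capture `factor` by value instead, since only the vector's heap storage is visible to the device.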

Also, does using -stdpar automatically imply -gpu=managed or -gpu=unified?

If on a Grace Hopper system, then it’s “unified”. Elsewhere, it’s “managed”.