I have a PGI + OpenACC code and I am trying to set it up to use unified memory.
I read online that, in order to activate unified memory, I have to set the flag “-ta=tesla:managed”.
Is that all, or do I also have to change my mallocs to cudaMallocManaged? And should I strip the copyin/copy/copyout clauses from my “#pragma acc” directives?
I ask because, after setting -ta=tesla:managed, nvprof still shows Host->Device and Device->Host copies of the data, with the same timings as before (for both the copies and the kernel).
Or could it be that nvprof somehow disables unified memory? (Sorry for the trivial question, I am very new to this.)
When you enable this option, the compiler replaces all of your malloc/new/allocate calls with the “managed” version. We use a managed pool allocator, so the code is not calling cudaMallocManaged directly, but the memory will still be managed.
And should I strip the copyin/copy/copyout clauses from my “#pragma acc” directives?
No need to do this. The compiler runtime will check if the variable is managed or not. If it is managed, then the data clause is essentially ignored.
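For example, here is a minimal sketch (hypothetical code, not your program). Compiled and linked with -ta=tesla:managed, the malloc calls become managed allocations and the existing data clauses can stay in place as harmless no-ops:

/* saxpy.c: hypothetical example */
#include <stdlib.h>

void saxpy(int n, float a, float *restrict x, float *restrict y)
{
    /* With x and y managed, copyin/copy are effectively ignored by the runtime. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    int n = 1 << 20;
    float *x = malloc(n * sizeof(float));   /* becomes a managed (pool) allocation */
    float *y = malloc(n * sizeof(float));   /* becomes a managed (pool) allocation */
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy(n, 2.0f, x, y);
    free(x);
    free(y);
    return 0;
}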
nvprof still shows Host->Device and Device->Host copies of the data, with the same timings as before (for both the copies and the kernel).
Without specifics, it’s difficult to say exactly why this is happening. However, keep in mind that managed memory is currently only available for dynamically allocated data. So if your code is using fixed-size arrays or objects, those objects still need to be manually managed.
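A small illustration of that point (again a made-up sketch): the statically sized array still needs an explicit data clause, while the malloc’d array does not:

#define N 4096

float fixed[N];                  /* static storage: not managed, still needs a data clause */

void scale(float *restrict dyn)  /* dyn came from malloc, so it is managed */
{
    #pragma acc parallel loop copy(fixed)   /* explicit copy still required */
    for (int i = 0; i < N; ++i)
        fixed[i] *= 2.0f;

    #pragma acc parallel loop               /* no data clause needed for dyn */
    for (int i = 0; i < N; ++i)
        dyn[i] *= 2.0f;
}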
You also need to make sure that you link with “-ta=tesla:managed”; otherwise the runtime check for managed data isn’t used.
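For example, with separate compile and link steps (the file names here are made up), the flag appears in both:

pgcc -fast -ta=tesla:managed -c main.c -o main.o
pgcc -fast -ta=tesla:managed main.o -o app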
For managed memory, the profiler should have a row that shows the relative “heat” of the page migrations between the host and device. It won’t show the individual data copies like it does with the data regions.
Or could it be that nvprof somehow disables unified memory?
It’s enabled by default, but it is possible to disable it when you create a profiling session. You also need a device that is capable of supporting unified memory.
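If you’re using the command-line profiler, there is (as far as I recall) a switch that controls unified-memory profiling; something along these lines:

nvprof --unified-memory-profiling per-process-device ./app   # default: UM page migrations are shown
nvprof --unified-memory-profiling off ./app                  # UM profiling disabled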
I’m writing a program that runs on multiple GPUs in a single node. I want to use managed memory to allocate the variables on different GPUs and use standard parallel Fortran. Could you give an example for this situation?
CUDA Unified Memory (aka “managed”) works across multiple GPUs on a system (see: HERE), so you don’t really need to do anything special. This applies when you’re using a single process to manage multiple GPUs, such as via OpenMP or by programmatically changing devices.
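You asked for Fortran, but here is the idea as a minimal C/OpenACC sketch (the device switching uses the same acc_* routines you would call from Fortran; all names and sizes are made up). Because the array is managed, the same allocation is visible from every device the process uses:

#include <stdlib.h>
#include <openacc.h>

int main(void)
{
    int n = 1 << 20;
    int ndev = acc_get_num_devices(acc_device_nvidia);
    float *x = malloc(n * sizeof(float));   /* managed with -ta=tesla:managed */
    for (int i = 0; i < n; ++i) x[i] = 1.0f;

    /* One process driving several GPUs: switch devices explicitly.       */
    /* Since x is managed, no copyin/copyout is needed on any device.     */
    for (int d = 0; d < ndev; ++d) {
        acc_set_device_num(d, acc_device_nvidia);
        int chunk = n / ndev;
        int start = d * chunk;
        #pragma acc parallel loop
        for (int i = start; i < start + chunk; ++i)
            x[i] *= 2.0f;
    }
    free(x);
    return 0;
}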
Though I prefer to use MPI for multi-GPU programming, in which case each rank has its own CUDA context and therefore its own CUDA unified address space. So again, not a problem, but the UM is not shared across ranks.
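And a sketch of the MPI flavor I mentioned, using the usual rank-modulo-device-count mapping for a single node (again hypothetical code, not a definitive recipe):

#include <stdlib.h>
#include <mpi.h>
#include <openacc.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank binds to one GPU; each rank gets its own CUDA context */
    /* and therefore its own unified address space.                    */
    int ndev = acc_get_num_devices(acc_device_nvidia);
    acc_set_device_num(rank % ndev, acc_device_nvidia);

    int n = 1 << 20;
    float *x = malloc(n * sizeof(float));   /* managed, but only within this rank */
    for (int i = 0; i < n; ++i) x[i] = (float)rank;

    #pragma acc parallel loop               /* no data clauses needed */
    for (int i = 0; i < n; ++i)
        x[i] *= 2.0f;

    free(x);
    MPI_Finalize();
    return 0;
}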