CUDA Unified Memory By PGI

Hi,

After I compile the code with the “-ta=tesla:managed” flag, how can I tell which data management is actually being used?

P.S.: I used the PGI compiler to build an open-source program, and the “-ta=tesla:managed” flag seems to have no effect.

best,

Jackie

Hi Jackie,

Setting the environment variable “PGI_ACC_NOTIFY=2” will show the data transfers between the device and host. If you compile with “-ta=tesla” and see the data transfers, then compile with “-ta=tesla:managed” and no longer see data transfers (since they would now be handled directly by the CUDA driver), then you know for certain that CUDA Unified Memory is in effect. You can also run your code through nvprof and see if there are differences.

Note that CUDA Unified Memory is only available for dynamic memory. Static data still needs to be managed via OpenACC data directives.

I used the PGI compiler to build an open-source program, and the “-ta=tesla:managed” flag seems to have no effect.

Does the code contain OpenACC directives?

  • Mat

Hi Mat,

First, I am having some trouble setting the environment variable “PGI_ACC_NOTIFY=2”. Do I just add it as a flag to configure?

Second, if I don’t manage static data when using CUDA Unified Memory, what will happen?

Third, yes, the code contains many OpenACC directives. Is there anything I should keep an eye on in order to make good use of them?

best,

Jackie

Hi Jackie,

Environment variables are set in your shell. If you are using csh, use “setenv PGI_ACC_NOTIFY 2”. For bash use “export PGI_ACC_NOTIFY=2”. Set this before running your program.
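For example, in a bash shell (the program name is illustrative; substitute your own executable):

```shell
# Enable data-transfer messages from the OpenACC runtime
export PGI_ACC_NOTIFY=2
echo "PGI_ACC_NOTIFY=$PGI_ACC_NOTIFY"

# ./a.out    <- now run your OpenACC program here;
#               transfer messages are printed to stderr

# csh equivalent:
#   setenv PGI_ACC_NOTIFY 2

# Turn the messages back off when done
unset PGI_ACC_NOTIFY
```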

Since the program already uses OpenACC, data movement is handled by the directives and the OpenACC runtime. CUDA Unified Memory would be used only for dynamic data, where it overrides the OpenACC runtime’s management. Static data would still be managed by the OpenACC runtime.

  • Mat

Thank you, that helped me.

Hi Mat,

I get an error when using CUDA Unified Memory.

malloc: cuMemMallocManaged returns error code 8010: ALLOCATE: 400000 bytes requested; not enough memory

My memory information is listed below.

[webber@localhost RE__DEPLOYMENT_RESTRICTED]$ free
              total        used        free      shared  buff/cache   available
Mem:        3806804     1411072      157988        6308     2237744     2054228
Swap:      10485756      244928    10240828

[webber@localhost RE__DEPLOYMENT_RESTRICTED]$ nvidia-smi
Wed Apr  6 09:04:16 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.39     Driver Version: 352.39         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 705      Off  | 0000:01:00.0     N/A |                  N/A |
| 15%   38C    P8    N/A /  N/A |    95MiB /  1023MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+

best,

Jackie