Unified memory - more than 1 GPU


Is there a plan to add support for unified memory with more than 1 GPU?

Hi hhward,

CUDA Unified Memory support on more than one GPU has been available for quite some time.

See: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-multi-gpu


Thanks for your reply.

Is it supported with openacc too?

Is it supported with openacc too?

Yes, PGI’s implementation of OpenACC does support CUDA Unified Memory. You can enable it via the compiler flag “-ta=tesla:managed”.

For details please see: https://www.pgroup.com/resources/docs/18.10/x86/pgi-user-guide/index.htm#acc-mem-unified
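As an illustration, here is a minimal sketch of an OpenACC kernel that relies on managed memory (the program name, array names, and sizes are made up for this example):

```fortran
! Minimal sketch: with -ta=tesla:managed, allocatable arrays live in
! CUDA Unified Memory and migrate between host and device on demand.
program managed_demo
   implicit none
   integer, parameter :: n = 1000000
   integer :: i
   real, allocatable :: a(:), b(:)

   allocate(a(n), b(n))
   b = 1.0

   ! No copyin/copyout clauses: the runtime pages data in as needed
   !$acc parallel loop
   do i = 1, n
      a(i) = 2.0 * b(i)
   end do

   print *, 'a(1) =', a(1)
end program managed_demo
```

Compiled with something like "pgfortran -ta=tesla:managed managed_demo.f90", the loop runs on the GPU without any explicit data directives.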

Hope this helps,

Thank you for your reply.

I have tested this, but it seems that only one GPU is being used.

I have a test case with 17.7 GB of data on a machine with four P100s (16 GB per card). The memory usage (checked with nvidia-smi) shows that only one of the cards is in use.

pgfortran test.f90 -o xacc -mp=allcores -ta=tesla:managed

is used to compile.

It looks like all calculations are done on one card.

Do you have any idea what’s wrong?

How are you assigning the OpenMP threads to the GPU devices?

Are the OpenACC regions within the OpenMP parallel regions?

To set the device number, you’ll want something like this early in the code:

     use omp_lib
     use openacc

     integer :: devNum, thid, dev

     ! query how many devices of the default type are available
     devNum = acc_get_num_devices(acc_get_device_type())
!$omp parallel private(thid,dev)
     thid = omp_get_thread_num()
     ! round-robin: map each OpenMP thread to a device
     dev = mod(thid,devNum)
     call acc_set_device_num(dev, acc_get_device_type())
     call acc_init(acc_get_device_type())
!$omp end parallel

The OpenMP threads retain the same device for subsequent parallel regions.
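If it’s unclear whether the mapping took effect, a quick diagnostic sketch (assuming the same omp_lib and openacc modules are in scope) is to have each thread report the device it is bound to:

```fortran
! Diagnostic sketch: each OpenMP thread prints the OpenACC device
! number it is currently using.
!$omp parallel private(thid)
   thid = omp_get_thread_num()
   print *, 'OpenMP thread', thid, 'is using device', &
            acc_get_device_num(acc_get_device_type())
!$omp end parallel
```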

! run parallel host threads
!$omp parallel do private(j)
do i=1,N
   ! offload the inner loop to each thread's device
   !$acc parallel loop
   do j=1,M
      ! ... loop body ...
   end do
end do
!$omp end parallel do
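Putting the device assignment and the nested loops together, a hedged end-to-end sketch might look like the following (N, M, and the array x are illustrative, not taken from your test case):

```fortran
program multi_gpu_demo
   use omp_lib
   use openacc
   implicit none
   integer, parameter :: N = 4, M = 1000000
   integer :: i, j, thid, dev, devNum
   real, allocatable :: x(:,:)

   allocate(x(M,N))

   ! bind each OpenMP thread to a GPU, round-robin
   devNum = acc_get_num_devices(acc_get_device_type())
!$omp parallel private(thid,dev)
   thid = omp_get_thread_num()
   dev  = mod(thid, devNum)
   call acc_set_device_num(dev, acc_get_device_type())
   call acc_init(acc_get_device_type())
!$omp end parallel

   ! outer loop across host threads; each inner loop is
   ! offloaded to the device its thread was bound to above
!$omp parallel do private(j)
   do i = 1, N
      !$acc parallel loop
      do j = 1, M
         x(j,i) = real(i) + real(j)
      end do
   end do
!$omp end parallel do

   print *, x(1,1), x(M,N)
end program multi_gpu_demo
```

With -ta=tesla:managed, the allocatable array x is placed in unified memory, so each GPU can touch its slice without explicit data clauses.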

Hope this helps,