Hello,
Since the 13.x compilers (we currently use 13.6), we have had severe problems executing OpenACC programs on our nodes with two GPUs (each node has two NVIDIA Quadro 6000 (Fermi) GPUs). The problem is that any arbitrary OpenACC program occupies BOTH GPUs instead of only one. If we start, e.g., a Jacobi solver on one GPU (without setting any device number), it runs on device 0 AND device 1. The program neither reports this nor prints its output twice, but both executions are visible with “nvidia-smi” (see below).
>$ nvidia-smi
Fri Jul 26 10:42:01 2013
+------------------------------------------------------+
| NVIDIA-SMI 4.310.40 Driver Version: 310.40 |
|-------------------------------+----------------------+----------------------+
| GPU Name | Bus-Id Disp. | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro 6000 | 0000:02:00.0 Off | 0 |
| 30% 78C P0 N/A / N/A | 6% 324MB / 5375MB | 79% E. Process |
+-------------------------------+----------------------+----------------------+
| 1 Quadro 6000 | 0000:85:00.0 On | 0 |
| 30% 74C P0 N/A / N/A | 2% 98MB / 5375MB | 0% E. Process |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes: GPU Memory |
| GPU PID Process name Usage |
|=============================================================================|
| 0 29538 ./laplace_openacc 362MB |
| 1 29538 ./laplace_openacc 362MB |
+-----------------------------------------------------------------------------+
This is a problem because we have set our GPUs to the compute mode “exclusive process”. While one OpenACC program (running on both devices) is executing, it prohibits any other user from starting a GPU program, which is really bad for us.
We have the same problem with MPI programs from a single user. If an MPI program with two processes runs on one node and each process should actually talk to one GPU (selected by its rank number), it does not work: while initializing the first device (acc_init), the first process takes both GPUs, so the second process gets an error and terminates with a context error.
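One workaround we could imagine (sketched below, not yet verified against 13.6) is to hide all but one GPU from each rank via the standard CUDA runtime variable CUDA_VISIBLE_DEVICES before acc_init runs, using a small launcher script. The script name and the local-rank variable are assumptions: OMPI_COMM_WORLD_LOCAL_RANK is what OpenMPI exports; other MPI implementations use different variables (MVAPICH2, for instance, sets MV2_COMM_WORLD_LOCAL_RANK).

```shell
#!/bin/sh
# gpu_bind.sh -- hypothetical launcher: bind each MPI rank on a node to
# one of the two GPUs by exposing only that device to the CUDA runtime,
# so acc_init cannot create a context on the second GPU.
# OMPI_COMM_WORLD_LOCAL_RANK is OpenMPI-specific; adjust for your MPI.
rank=${OMPI_COMM_WORLD_LOCAL_RANK:-0}
export CUDA_VISIBLE_DEVICES=$((rank % 2))   # two Quadro 6000s per node
echo "rank $rank -> GPU $CUDA_VISIBLE_DEVICES"
exec "$@"
```

It would then be launched as, e.g., `mpirun -np 2 ./gpu_bind.sh ./laplace_openacc` (program name as in the nvidia-smi listing above); inside each process, device 0 is then the only GPU the runtime can see.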
Do you have any ideas for a workaround? Will this be fixed in the next compiler releases? (It was not an issue with 12.9, for example.)
Thanks, Sandra