Invalid Device when using open mpi to run multiple processes

ads1515 · August 3, 2017, 11:37pm

I am running my code on an 8-GPU node with MPS enabled. I am trying to overload the GPUs by running 21 processes through MPI in this fashion:

mpirun -np 21 ./a.out

This run results in the following error:
call to cuDevicePrimaryCtxRetain returned error 101: Invalid device

When I run this on a machine with only a single GPU, no issues occur and it executes (inefficiently) through MPS correctly.

I am certain that it has to do with how I am calling ACC_INIT:

ACC_NUM = ACC_GET_NUM_DEVICES(ACC_DEVICE_NVIDIA)
GPUNUM = MOD(MYID,ACC_NUM)
CALL ACC_SET_DEVICE(GPUNUM,ACC_DEVICE_NVIDIA)
CALL ACC_INIT(ACC_DEVICE_NVIDIA)
ACC_DEV = ACC_GET_DEVICE_NUM(ACC_DEVICE_NVIDIA)

Any help would be appreciated.

tull · August 4, 2017, 4:57pm

If you have 8 GPUs on your one platform and wish to use them all simultaneously, the usual method is to run an OpenMP parallel section with 8 threads on the CPU, where each thread selects a different GPU and then runs the GPU code on its assigned device. You can sync all the work at the end of the OpenMP section.
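
As a rough sketch of that OpenMP approach (not tested on your system): the program name, the array x, and the size n below are made up for illustration, and it assumes the 0-based device numbering that your own MOD(MYID,ACC_NUM) code already relies on. It would need to be built with both OpenMP and OpenACC enabled.

```fortran
program omp_multi_gpu
  use openacc
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical problem size
  integer :: ngpus, dev, i
  real, allocatable :: x(:,:)

  ! Ask the OpenACC runtime how many NVIDIA devices it can see.
  ngpus = acc_get_num_devices(acc_device_nvidia)
  if (ngpus < 1) stop 'no NVIDIA devices visible'

  allocate(x(n, ngpus))

  ! One loop iteration per GPU; with enough threads, each thread gets one device.
  !$omp parallel do num_threads(ngpus) private(i)
  do dev = 0, ngpus - 1
     ! Assumption: device numbers run from 0 to ngpus-1.
     call acc_set_device_num(dev, acc_device_nvidia)

     ! Each device fills its own column of x.
     !$acc parallel loop copyout(x(1:n, dev+1:dev+1))
     do i = 1, n
        x(i, dev+1) = real(i) * real(dev + 1)
     end do
  end do
  !$omp end parallel do
  ! The implicit barrier at the end of the parallel do is the sync point.

  print *, 'devices used:', ngpus, ' first row sum:', sum(x(1,:))
end program omp_multi_gpu
```

The device selection is made per host thread, which is why the acc_set_device_num call sits inside the OpenMP region rather than before it.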

Running pgaccelinfo will tell you what the compilers can see (8 GPUs?), so you can make sure the compilers can access them.

A multi-process MPI program has to know which GPUs are available,
or it may end up just waiting for processes to end.
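
One way to check what each MPI process actually sees is a small diagnostic along these lines. This is only a sketch under assumptions: the program and variable names are invented, it uses the OpenACC 2.x routine name acc_set_device_num, and it again assumes 0-based device numbers. It is not claimed to cure error 101; it only prints the device count and the chosen device per rank so a bad value stands out.

```fortran
program mpi_pick_gpu
  use mpi
  use openacc
  implicit none
  integer :: ierr, world_rank, node_comm, local_rank, ngpus, dev

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, ierr)

  ! Group the ranks that share a node, and get this rank's index within that group.
  call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
                           MPI_INFO_NULL, node_comm, ierr)
  call MPI_Comm_rank(node_comm, local_rank, ierr)

  ! How many NVIDIA devices does this particular process see?
  ngpus = acc_get_num_devices(acc_device_nvidia)
  if (ngpus < 1) then
     print *, 'rank', world_rank, ': no NVIDIA devices visible'
     call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if

  ! Round-robin over the visible devices (assumed to be numbered 0..ngpus-1).
  dev = mod(local_rank, ngpus)
  call acc_set_device_num(dev, acc_device_nvidia)
  call acc_init(acc_device_nvidia)

  print *, 'rank', world_rank, ' local rank', local_rank, &
           ' device', acc_get_device_num(acc_device_nvidia), ' of', ngpus

  call MPI_Comm_free(node_comm, ierr)
  call MPI_Finalize(ierr)
end program mpi_pick_gpu
```

On a single node the node-local rank is the same as the world rank, so the mapping matches MOD(MYID,ACC_NUM); the local rank only matters once the job spans several nodes. Launched with mpirun -np 21, every rank should report the same device count and a device number below it.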

The GPUs do not do multi-tasking; they only run one job at a time. I am not sure overloading processes on the same platform to access individual GPUs will be successful.

dave
