I have come across a problem that I think may be an issue with either the compiler or OpenMPI.
I have a multi-GPU MPI+OpenACC code that uses CUDA-aware MPI through the host_data construct (use_device clause).
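In case it is useful, the pattern is essentially the following minimal sketch (the program, variable names, and sizes are illustrative, not taken from POT3D):

program halo_sketch
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, left, right
  integer, parameter :: n = 1024
  real(8) :: a(n)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Periodic neighbors: with a single rank, left == right == rank,
  ! so the CUDA-aware path is exercised even on one GPU.
  left  = mod(rank - 1 + nprocs, nprocs)
  right = mod(rank + 1, nprocs)
  a = real(rank, 8)

  !$acc data copy(a)
  ! host_data exposes the device address of a to MPI, so a
  ! CUDA-aware build transfers directly between GPU buffers.
  !$acc host_data use_device(a)
  call MPI_Sendrecv(a(n), 1, MPI_REAL8, right, 0, &
                    a(1), 1, MPI_REAL8, left,  0, &
                    MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
  !$acc end host_data
  !$acc end data

  call MPI_Finalize(ierr)
end program halo_sketch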
For testing purposes, I have in the past run the code on a machine with a GTX 750 Ti and a Titan Xp, using both GPUs. In that case, the 750 Ti was also running the graphics windowing system (MATE).
My current system has a GTX 1050 Ti and an RTX 2080 Ti, with the 1050 Ti running the graphics. The code now crashes when using both GPUs (or just the 1050 Ti).
On another machine, I have a single RTX 2070 that runs the graphics. Running the code on that single GPU crashes in the same manner. If I disable the windowing system (server mode), the code runs fine. (Note that CUDA-aware MPI is still exercised even with one GPU, due to a periodic domain seam.)
The only common denominator I can see is that using CUDA-aware MPI on a GPU that is also running graphics fails when the GPU is Pascal or newer; it DID work with the 750 Ti, which is Maxwell.
The crashes happen shortly into the run, but not right away; the number of steps before the crash varies from run to run.
On the system with the single RTX 2070 running graphics, all of the CUDA 10.1 sample programs ran fine, including the multi-GPU tests. This leads me to think it is an OpenMPI or PGI issue.
All systems were running Linux Mint 19.
The crash spits out:
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
call to cuMemFreeHost returned error 700: Illegal address during kernel execution
I know it is not common to run computation and graphics on the same GPU, but it is useful for testing.
I think the POT3D code I previously sent you could reproduce this problem by toggling CUDA-aware MPI on and off.
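For reference, here is the kind of toggle I mean, as a minimal sketch; the gpu_aware flag and routine name are hypothetical (POT3D's actual switch may differ), and the array is assumed to already be present on the device:

subroutine seam_exchange(a, n, left, right, gpu_aware)
  use mpi
  implicit none
  integer, intent(in) :: n, left, right
  logical, intent(in) :: gpu_aware
  real(8), intent(inout) :: a(n)  ! assumed present on the device
  integer :: ierr

  if (gpu_aware) then
     ! CUDA-aware path: MPI is handed device addresses.
     !$acc host_data use_device(a)
     call MPI_Sendrecv(a(n), 1, MPI_REAL8, right, 0, &
                       a(1), 1, MPI_REAL8, left,  0, &
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
     !$acc end host_data
  else
     ! Host-staged path: move the seam values through host memory,
     ! avoiding GPU-to-GPU MPI entirely.
     !$acc update host(a(n:n))
     call MPI_Sendrecv(a(n), 1, MPI_REAL8, right, 0, &
                       a(1), 1, MPI_REAL8, left,  0, &
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
     !$acc update device(a(1:1))
  end if
end subroutine seam_exchange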