I’m new to OpenACC, and I’d like to know whether it is possible to use atomic operations with it. In the GPU code I’m trying to port to OpenACC, there is a LARGE matrix that needs to be updated by the threads in parallel, and since it’s quite large, giving each thread its own copy with private is not an option. Is it possible to use atomics with OpenACC, or can you suggest another way to handle this?
OpenACC is designed to be device-neutral, hence NVIDIA-specific features such as atomics are not supported.
OpenACC is interoperable with CUDA C/Fortran, so you may consider writing this one kernel directly in CUDA if you must use atomics.
Hope this helps,
Sure, it helps. When developing code in OpenMP I ran into this kind of situation, and at that time I could use, for instance, the compiler’s built-in atomic operations (gcc) inside a parallel loop. Would I be able to do something similar with OpenACC, i.e. call a specific atomic operation inside a parallel loop, or is that compiler-specific?
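For reference, here is a minimal sketch of the OpenMP pattern described above: a parallel loop in which several iterations may update the same matrix element, with the update done through a GCC built-in atomic. The function name `scatter_add` and its parameters are hypothetical and chosen only for illustration, and integer data is assumed so the built-in applies directly.

```c
#include <stddef.h>

/* Hypothetical helper: for each of the n updates, add val[i] to
   matrix[idx[i]]. Different iterations may target the same element,
   so the addition has to be atomic. */
void scatter_add(long *matrix, const size_t *idx, const long *val, size_t n)
{
    /* Without -fopenmp the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        /* GCC built-in atomic add; relaxed ordering is assumed to be
           sufficient because this is a pure accumulation. */
        __atomic_fetch_add(&matrix[idx[i]], val[i], __ATOMIC_RELAXED);
    }
}
```

With gcc this would be built with something like `gcc -fopenmp`. The shared matrix is updated in place, so no per-thread private copy is needed.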
OpenACC and CUDA are interoperable, but they can’t be mixed together, so no, you couldn’t put a CUDA atomic inside an OpenACC region. I’ll ask our OpenACC representative, though, to see what, if anything, can be done in the standard to allow such extensions.
I would very much appreciate it.