OpenACC vs. CUDA

Hi JMa,

These are kind of high-level questions, so let me try to give you some high-level answers as well.

This is an interesting question, and there doesn’t seem to be just one answer to it. IMHO, OpenACC will persist for at least the next few years (whether as OpenACC or as part of OpenMP 4.0). However, low-level programming models will likely still exist in the near future because they give the programmer the ability to highly tune their application (and that’s what HPC is about, right?).
On the other hand, OpenACC eases the way we program coprocessors. So I think that both approaches can benefit from each other. E.g.: use OpenACC for the easy parts and manually fine-tune some compute-intensive kernels with CUDA.

The only situation I can think of that would make CUDA redundant is if compilers became so powerful that they generate the same high-performance code you could achieve with CUDA (or at least come within a few percent of it). I’m not a compiler engineer, but I doubt that this is what we’ll see in the next few years.

So there are some limitations as of right now:

  • Function calls are not yet supported within parallel regions (unless they can be inlined)**
  • No nested parallelism is allowed**
  • Another limitation is that you as a programmer cannot use CUDA intrinsic functions (e.g. warp functions) within your accelerator region. But this is the way it is supposed to be for a directive-based approach - it should be easy to use and portable across different architectures (i.e. no intrinsics).

I hope that this answers some of your questions.

Best,
Paul

** These features will be implemented in OpenACC 2.0 (so there’s hope :))
