The OpenACC specification (version 1.0 and later) says that if the acc_on_device routine has a compile-time constant argument, it evaluates at compile time to a constant.
However, the NVHPC OpenACC compiler (nvc++ V22.11) does not convert the acc_on_device routine with a compile-time constant argument to a constant at compile time.
Engineering took a look, but unfortunately it’s not possible to use “acc_on_device” within a constexpr. The problem has to do with phasing. Constexpr is evaluated by the front-end, but “acc_on_device” can’t be evaluated until back-end compilation, when code is generated for the device, the host, or both.
“acc_on_device” is a compile-time constant and should work as expected if removed from the constexpr.
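A minimal sketch of that suggestion, using a plain if rather than if constexpr. The function foo() and its printf("host\n") come from the question; the return values, the run_foo_in_region() driver, and the fallback definitions under #else are assumptions added only so the sketch also builds when OpenACC is not enabled.

```cpp
#include <cstdio>

#ifdef _OPENACC
#include <openacc.h>
#else
// Assumption: host-only stand-ins so the sketch builds without OpenACC.
// They are not part of any real OpenACC header.
enum acc_device_t { acc_device_host };
inline int acc_on_device(acc_device_t) { return 1; }
#endif

// With OpenACC enabled, this routine is compiled for both host and device.
#pragma acc routine seq
int foo() {
    // Plain 'if', not 'if constexpr': the condition is left for the
    // back-end to resolve separately for the host and device versions.
    if (acc_on_device(acc_device_host)) {
        std::printf("host\n");
        return 1;  // host path
    } else {
        return 0;  // device path
    }
}

// Hypothetical driver: calls foo() inside a compute region and
// returns whatever foo() reported.
int run_foo_in_region() {
    int result = -1;
    #pragma acc serial copy(result)
    {
        result = foo();
    }
    return result;
}
```

Built with something like nvc++ -acc, the call inside the compute region should take the device path, while a direct call to foo() from host code takes the host path.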
Thank you for the explanation.
Then, is there any nvc++ option that forces the compiler to remove the not-taken path of an if-else statement at compile time? In the above example, the device version of foo() will not execute the if-path (printf("host\n");), since acc_on_device(acc_device_host) in the device version of foo() will be evaluated to false at compile time.
In the back-end, “acc_on_device” does get turned into either true or false, depending on whether it’s generating device or host code. Dead code elimination then removes either the true or the false branch of the conditional. No additional flags are required.
For example, here’s the generated CUDA code. As you can see, there’s no if condition used:
Note that the old CUDA code generator (i.e. -gpu=nollvm,keep) isn’t supported any longer; it’s just convenient for showing this. You can see the same thing in the generated LLVM code as well (i.e. -gpu=keep), though it’s harder to read.
I verified the same behavior on a local test too; thanks.
The reason I asked about the behavior of if constexpr(acc_on_device(acc_device_host)) … is that I wanted the compiler front-end to eliminate the if-path code before the device-specific back-end generates the device code.
In the example below, hostfunc() is defined in an external library that works only on the host, so the OpenACC back-end will not be able to handle it properly.
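A minimal sketch of that situation, under stated assumptions: hostfunc() is the name used in this thread, but its signature and body here are hypothetical stand-ins (in the real case only a declaration from the external library would be visible), compute() is a hypothetical caller, and the definitions under #else exist only so the sketch builds without OpenACC.

```cpp
#ifdef _OPENACC
#include <openacc.h>
#else
// Assumption: host-only stand-ins so the sketch builds without OpenACC.
// They are not part of any real OpenACC header.
enum acc_device_t { acc_device_host };
inline int acc_on_device(acc_device_t) { return 1; }
#endif

// Hypothetical stand-in for a function from a host-only external library.
// In the real case only its declaration would be visible in this file.
int hostfunc(int x) { return x + 100; }

// Hypothetical caller that is compiled for both host and device.
#pragma acc routine seq
int compute(int x) {
    if (acc_on_device(acc_device_host)) {
        // Host-only call: the goal is for this branch to be absent
        // from the device version of compute().
        return hostfunc(x);
    }
    return x * 2;  // device path
}
```

The open question in this post is whether the hostfunc() call is removed early enough that the device compilation never has to deal with it.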
Is there a way for nvc++ to eliminate certain code in the front-end pass depending on whether it will be executed on the host or device?
(Based on your previous explanation, it seems not, since the nvc++ front-end is oblivious to the device-specific behavior in the back-end.)
It still does this, but just does it in the back-end. You shouldn’t need constexpr.
Granted, this is a simple case, but I re-wrote your example to use a library for the host call and it works fine. Did you try this with your real code? If you tried and encountered an issue, can you post a reproducing example?