Consider the following code:
#include <nv/target>
void mem_fence() {
  if target(nv::target::is_device){
    __threadfence();
  }
  if target(nv::target::is_host){
  }
}
__global__ void invoke(){
  mem_fence();
}
int main(){
  invoke<<<1,1>>>();
}
Compile it with:
nvc++ -cuda -o test test.cpp
With every nvc++ version we have tried (21.7, 22.2, and 22.7), this results in a linker error:
/usr/bin/ld: /tmp/nvc++eZ0OcmqorsPv3.o: in function `mem_fence()':
/tmp/test.cpp:7: undefined reference to `__builtin_is_device_code'
pgacclnk: child process exit status 1: /usr/bin/ld
The cause seems to be the empty if-target branch for the host. Putting some code in that branch makes the error go away, while leaving either branch empty triggers it.
Putting just an empty statement (a lone ;) inside the empty branch does not seem to be a sufficient workaround either; the branch seems to need actual code.
Of course, empty branches don’t seem particularly useful, but they can appear without the application developer writing them explicitly, e.g. when a branch contains macros that expand to nothing under some build configurations. As such, I believe there is value in supporting this use case - on top of this just being a surprising and hard-to-debug error for users in larger code bases :-)
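
To illustrate the macro case, here is a minimal sketch of how an empty host branch can appear without anyone writing one. HOST_TRACE and ENABLE_HOST_TRACE are hypothetical names made up for this example, not something from our code base. The __NVCOMPILER_CUDA__ guard is only there so the same file also builds with a plain host compiler. I have not checked this exact file against the affected nvc++ versions, but with ENABLE_HOST_TRACE undefined the host branch contains only an empty statement, which is the situation described above.

```cpp
// Sketch: a macro that leaves an if-target branch empty.
// HOST_TRACE / ENABLE_HOST_TRACE are hypothetical, for illustration only.
#include <cstdio>

#if defined(__NVCOMPILER_CUDA__)
#include <nv/target>  // only needed when building with nvc++ -cuda
#endif

#ifdef ENABLE_HOST_TRACE
#define HOST_TRACE(msg) std::puts(msg)
#else
#define HOST_TRACE(msg)  // expands to nothing in the default configuration
#endif

// Two-level stringification so the macro is expanded before being quoted.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// True when HOST_TRACE(...) expands to no tokens at all.
bool host_trace_expands_to_nothing() {
  static const char expansion[] = STRINGIFY(HOST_TRACE("x"));
  return expansion[0] == '\0';
}

void mem_fence_traced() {
#if defined(__NVCOMPILER_CUDA__)
  if target(nv::target::is_device){
    __threadfence();
  }
  if target(nv::target::is_host){
    HOST_TRACE("host fence");  // without ENABLE_HOST_TRACE this is just `;`
  }
#else
  HOST_TRACE("host fence");    // host-only build: same macro, no if-target
#endif
}
```

Building with -DENABLE_HOST_TRACE puts a real statement into the host branch, so whether the branch is empty depends entirely on the build configuration.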