Are you compiling with support for double precision? If not, it’s probably picking up the host’s pow(double, double) instead of the device’s pow(float, float).
I had a similar problem, and I do compile with “nvcc -arch=sm_13”.
It seems that CUDA simply doesn’t declare device versions of pow(double, float) and pow(float, double), while it does have pow(float, float) and pow(double, double).
This is easy to see in a small test kernel that calls pow with each combination of argument types: the lines with rdd and rff compile fine, but the rfd and rdf lines fail with "error: calling a host function from a __device__/__global__ function is only allowed in device emulation mode".
A simple fix is to explicitly cast both arguments either to float or to double.
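For illustration, here is a minimal sketch of the cast workaround. The kernel name powTest and the host helper runPowTest are hypothetical, and only the variable names rff, rdd, rfd and rdf come from the description above. It assumes a device with double-precision support (e.g. built with "nvcc -arch=sm_13" or newer).

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical test kernel; variable names mirror the ones used in the post.
__global__ void powTest(float f, double d, double *out)
{
    float  rff = pow(f, f);          // both arguments float
    double rdd = pow(d, d);          // both arguments double
    // pow(f, d) and pow(d, f) were the calls that failed to compile;
    // an explicit cast makes both arguments the same type.
    double rfd = pow((double)f, d);  // promote the float argument to double
    float  rdf = pow((float)d, f);   // or narrow the double argument to float
    out[0] = rff;
    out[1] = rdd;
    out[2] = rfd;
    out[3] = rdf;
}

// Hypothetical host helper: launches the kernel and copies the four results back.
cudaError_t runPowTest(float f, double d, double *results)
{
    double *dOut = NULL;
    cudaError_t err = cudaMalloc((void **)&dOut, 4 * sizeof(double));
    if (err != cudaSuccess) return err;
    powTest<<<1, 1>>>(f, d, dOut);
    // cudaMemcpy waits for the kernel to finish before copying.
    err = cudaMemcpy(results, dOut, 4 * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return err;
}

int main()
{
    double r[4];
    cudaError_t err = runPowTest(2.0f, 3.0, r);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("rff=%g rdd=%g rfd=%g rdf=%g\n", r[0], r[1], r[2], r[3]);
    return 0;
}
```

Casting to double keeps full precision but uses the slower double-precision path; casting to float is faster but throws away precision from the double argument, so which cast is appropriate depends on the computation.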