When compiling my CUDA Fortran code, pgfortran ignores the ^ operator and reports that pow cannot be used in a device kernel (it is allowed only in emulation mode).
I found conflicting information about support for math functions.
The PGI webpage says:
Q: Does the compiler support IEEE standard floating-point arithmetic?
A: The GPU accelerators available today support most of the IEEE floating-point standard. However, they do not support all the rounding modes, and some operations, notably square root, exponential, logarithm, and other transcendental functions, may not deliver full precision results. This is a hardware limitation that compilers cannot overcome.
That’s right, of course. I was rather vague in posting the issue.
Actually, even ** was not accepted in a device kernel by the compiler (pgfortran):
PGF90-S-0000-Internal compiler error. unsupported operation 185
It sort of puts me in a fix, because our Fortran code uses the following functionality (double precision):
**
sqrt
exp
So without them, I will not be able to port it to CUDA Fortran. Would you mind providing some pointers to literature, if they are already supported (double precision)?
These routines are all supported in CUDA Fortran so something else is going on. Can you please either post or send to PGI Customer Support (trs@pgroup.com) a reproducing example?
I tweaked the kernel a little in the meantime and fixed the compile error. For some reason, 10**zn2 is not valid, while 10.0**zn2 is accepted.
Unsupported kernel:
attributes(global) subroutine addnum_kernel( zdarr1, zn2, zdarr3 )
  integer, device :: zn2
  integer, device :: zdarr1(10), zdarr3(10)
  integer :: ix
  do ix = 1,10
    zdarr3(ix) = zdarr1(ix)*(10**zn2)
  end do
end subroutine addnum_kernel
Proper, supported kernel:
attributes(global) subroutine addnum_kernel( zdarr1, zn2, zdarr3 )
  integer, device :: zn2
  integer, device :: zdarr1(10), zdarr3(10)
  integer :: ix
  do ix = 1,10
    zdarr3(ix) = zdarr1(ix)*(10.0**zn2)
  end do
end subroutine addnum_kernel
Aditya
PS: Just to warn fellow CUDA programmers, there are issues with declaring zdarr3 etc. as integer.
It looks like a compiler issue to me, so I created a problem report (TPR#18956) and sent it on to engineering. You’ve already determined that the workaround is to use a real instead of an integer for the base value.
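For anyone who hits the same error, here is a minimal sketch of the real-base workaround as a complete program. It is only an illustration under a few assumptions that are not in the posts above: the module name addnum_mod, the host program, the one-thread-per-element indexing, passing zn2 by value, and the nint() call that rounds the real result back to an integer are all my additions.

```fortran
module addnum_mod
  use cudafor
  implicit none
contains
  ! Kernel using a real base (10.0) instead of an integer base (10).
  attributes(global) subroutine addnum_kernel(zdarr1, zn2, zdarr3)
    integer, value  :: zn2                  ! assumption: scalar passed by value
    integer, device :: zdarr1(10), zdarr3(10)
    integer :: ix
    ix = threadIdx%x                        ! assumption: one thread per element
    if (ix <= 10) then
      ! 10.0**zn2 is real; nint() (my addition) rounds it back to an integer
      zdarr3(ix) = zdarr1(ix) * nint(10.0**zn2)
    end if
  end subroutine addnum_kernel
end module addnum_mod

program test_addnum
  use cudafor
  use addnum_mod
  implicit none
  integer :: h1(10), h3(10), i
  integer, device :: d1(10), d3(10)

  h1 = (/ (i, i = 1, 10) /)
  d1 = h1                                   ! host-to-device copy
  call addnum_kernel<<<1, 10>>>(d1, 2, d3)  ! zn2 = 2, so each element is scaled by 10**2
  h3 = d3                                   ! device-to-host copy
  print *, h3
end program test_addnum
```

Saved as a .cuf file, this should build with pgfortran in the usual way. Keeping the arrays as integer and rounding explicitly is just one option; declaring them as real, as the PS above hints, avoids the conversion altogether.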