CUDA Fortran Kernel Double Precision Array

Sam_Murdoch · September 11, 2019, 9:51am

Hi,

I’m using an NVIDIA GeForce RTX 2080 Ti card.

I have a CUDA Fortran kernel into which a number of DOUBLE PRECISION arrays are passed.
The kernel calculates a value from these for the given thread index (I), then attempts to store it in an array, R, declared as:

DOUBLE PRECISION, INTENT(OUT) :: R(:)

For illustration purposes, the calculation is

(a(I)*b(I)) + c + d + e + f + g + h

When I do

WRITE(*,*) (a(I)*b(I)) + c + d + e + f + g + h

in the kernel, I can see that the value of the term is correct: 4.4408920985006262E-016

When I set:

R(I) = (a(I)*b(I)) + c + d + e + f + g + h

then

WRITE(*,*) R(I)

The value of R(I) is zero.

I know that the values are on the boundaries of machine precision so this must be significant. If I explicitly set R(I) to some small constant, for example

R(I) = 5

Then everything works as expected, so I don’t believe there is anything wrong with the process of calling and returning values from the kernel.

Is there a precision limitation that applies or compiler flags that I could be missing?

Any help here would be greatly appreciated.

Thanks

MatColgrove · September 11, 2019, 4:12pm

Hi Sam,

> Is there a precision limitation that applies or compiler flags that I could be missing?

FMA is enabled by default in device code, but I highly doubt that it would cause this issue. Still, you can try compiling with “-Mnofma” to disable it.

It doesn’t quite make sense why printing “R(I)” would differ from printing the computation directly. The compiler would need to generate a temp variable to hold the result before printing, which shouldn’t be different than if it were stored to R.

Can you post or send to PGI Customer Service (trs@pgroup.com) a reproducing example? That would help to determine what’s going on.
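
A reproducing example along these lines might be a reasonable starting point. It is only a sketch: the module, kernel, and variable names (repro_mod, calc, a_d, and so on) are hypothetical, the expression is shortened to a single scalar term, and the input values are placeholders chosen to give a result near machine epsilon, since the original data isn't shown. It prints the expression directly in the kernel, stores it to R(I), prints R(I), and then prints the copy returned to the host.

```fortran
! Hypothetical minimal reproducer. Names and input values are illustrative
! assumptions, not taken from the original code.
! Assumed build lines (PGI):  pgfortran -Mcuda repro.cuf
!                             pgfortran -Mcuda -Mnofma repro.cuf
module repro_mod
  use cudafor
  implicit none
contains
  attributes(global) subroutine calc(a, b, c, r, n)
    double precision, intent(in)  :: a(:), b(:)
    double precision, value       :: c
    integer, value                :: n
    double precision, intent(out) :: r(:)
    integer :: i

    ! Global thread index (1-based)
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) then
      ! 1) print the expression directly
      write(*,*) a(i)*b(i) + c
      ! 2) store the same expression, then print the stored value
      r(i) = a(i)*b(i) + c
      write(*,*) r(i)
    end if
  end subroutine calc
end module repro_mod

program repro
  use cudafor
  use repro_mod
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n), b(n), r(n)
  double precision, device :: a_d(n), b_d(n), r_d(n)
  integer :: istat

  ! Placeholder inputs: the product is very close to 1, so adding c = -1
  ! leaves a result on the order of machine epsilon.
  a = 1.0d0 + epsilon(1.0d0)
  b = 1.0d0 + epsilon(1.0d0)
  a_d = a
  b_d = b

  call calc<<<1, n>>>(a_d, b_d, -1.0d0, r_d, n)
  istat = cudaDeviceSynchronize()

  ! 3) copy R back and print it on the host
  r = r_d
  write(*,*) r
end program repro
```

Building it once with and once without “-Mnofma” and comparing the three printed values would show whether the direct print, the stored value, and the host copy actually disagree.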

Thanks,
Mat
