I’m using a Fortran (f90) code for Monte Carlo calculations. During execution the program calls the intrinsic function ANINT(x) many times to get the integer value (returned as a real) nearest to x.
I tried the same program written in C, and its performance is about 30% better than the Fortran version. After some tests, I realized that the Fortran ANINT is slower than the corresponding rint in C; in fact, if I remove these functions from both programs, the Fortran code is faster.
So, is there a way to use the C rint function from the Fortran code?
Thanks a lot.
Chapter 10 of the PGI User’s Guide gives a good introduction to inter-language calling conventions. Basically, you need to create a C wrapper function that calls rint. Note the extra underscore after the function name and the use of pointer arguments. Also note that rint is a type-generic macro that pgcc complains about, so you’ll need to compile this file with gcc.
Example:
myrint.c
#include <tgmath.h>
void myrint_(float *val, float *rval) {
    *rval = rint(*val);
}
test.f
      program testing
      real x, y
      x = 123.1456
      call myrint(x, y)
      write(*,*) 'RINT(', x, ')=', y
      y = 0.0
      y = anint(x)
      write(*,*) 'ANINT(', x, ')=', y
      y = 0.0
      x = 123.5678
      call myrint(x, y)
      write(*,*) 'RINT(', x, ')=', y
      y = 0.0
      y = anint(x)
      write(*,*) 'ANINT(', x, ')=', y
      end
Command line:
quartet:/tmp% gcc -c myrint.c
quartet:/tmp% pgf90 -c test.f
quartet:/tmp% pgf90 test.o myrint.o
quartet:/tmp% a.out
RINT( 123.1456 )= 123.0000
ANINT( 123.1456 )= 123.0000
RINT( 123.5678 )= 124.0000
ANINT( 123.5678 )= 124.0000
Let us know if this helps performance since I’m very curious why anint is so much slower.
- Mat
Thanks for the help!
I tried this, but the execution time is longer.
In fact, using ANINT (in Fortran) takes about 33% less time than calling RINT (in C) through the wrapper.
The problem is probably elsewhere!
Let me ask another question: why is the myrint function so much slower when declared with DOUBLE than with FLOAT?
Thanks a lot.
Stefano
Off hand I don’t know why the double version is so much slower. When you compile with “gcc -S” and view the generated assembly, the float version calls “rintf” and the double version calls “rint”. My guess is that it simply takes more instructions to round a double than it does a float.
Have you tried compiling with a profiling flag and using the PGI profiler pgprof? This tool can really help in determining performance bottlenecks. A complete description of pgprof can be found in the PGI Tools Guide. PGI version 20.4 Documentation for x86 and NVIDIA Processors
- Mat