We are porting a package from Windows to Linux (32-bit Red Hat 9).
We compile with OpenMP since both our Windows and Linux machines have
multiple CPUs.
On Windows, we got good performance with Intel Fortran. However,
on Linux, running with 1 CPU and with 2 CPUs gives about the same
performance (same elapsed/CPU time, with the Linux machine under very
little other load).
On Linux, we have PGI Workstation 7.2-3, and the benchmark code pi.f
works perfectly. Therefore we are confident that PGI Fortran has been
installed properly.
The bizarre thing is the following. With OMP_NUM_THREADS=2, the top
command shows both CPU0 and CPU1 at nearly 100% usage (the executable
is the only significant load while running our package). Besides, the
executable (only one process is shown) has the expected increase under
the TIME column, which is about 10 seconds at each 5-second refresh.
With OMP_NUM_THREADS=1, the TIME column for the executable increases by
5 seconds at each 5-second refresh, yet the elapsed time is about the
same as running the executable with OMP_NUM_THREADS=2.
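
To make the arithmetic behind these readings explicit, here is a small
sketch in Python (purely illustrative; the function names are
hypothetical and not part of our package). It converts the top readings
into an average number of busy CPUs, and inverts Amdahl's law to show
what parallel fraction a given speedup would imply. The speedup of 1.0
is an assumption standing in for "about the same elapsed time".

```python
def busy_cpus(cpu_seconds, wall_seconds):
    """Average number of CPUs kept busy: CPU time used per second of wall time."""
    return cpu_seconds / wall_seconds


def implied_parallel_fraction(speedup, threads):
    """Invert Amdahl's law S = 1 / ((1 - p) + p / n) to solve for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


# Readings from top: TIME grows by 10 s (2 threads) or 5 s (1 thread)
# during each 5 s refresh interval.
two_threads = busy_cpus(10.0, 5.0)
one_thread = busy_cpus(5.0, 5.0)

# Assumption: "about the same elapsed time" is treated as a speedup of 1.0.
fraction = implied_parallel_fraction(1.0, 2)

print(two_threads, one_thread, fraction)
```

In other words, with 2 threads the CPU time doubles while the elapsed
time does not drop.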
Any opinion on this situation (no performance advantage with 2 CPUs)?