I have a finite difference solver written in Fortran 90 with HPF directives inserted. I am running on a dual-processor Xeon with 2 GB of memory.
If I use the Intel compiler v8 (ifort) with vectorisation turned on, my code executes 10 timesteps in 30s running in serial. Compiling the same code with the pghpf compiler results in an execution time of 2m30s.
If I add the -Mf90 switch, this comes down to the Intel time of 30s.
Since I have a dual-processor machine, I would like to run an SMP setup.
Compiling with -Msmp using the pghpf compiler reduces the run time to 1m18s, running at around 90% on both CPUs. However, if I add the -Mf90 switch, the two processors both run the same code with no communication.
The Intel compiler refuses to recognise parallel sections of code (it complains about not being able to read the trip count), so I cannot use it to compare times for an SMP setup.
My question is: why does the -Mf90 switch seem to disable or override the -Msmp switch? I am still getting messages about loops being parallelised; it is just that when I execute the code, it simply runs two separate processes.
The compilation switches and commands I have used are as follows:
(2m30s serial, 1m30s smp)
pghpf -Mautopar -fastsse -O2 -tp p7 -Mcache_align -Mvect=sse -Mconcur=assoc -Minline=levels:10 -Minfo=all -Minform=inform -Mnofree -Msmp fd3d_map.f90 fd3drout1.f90 diffrout1.f90 -o pghpf_fd3d.x
Adding -Mf90 reduces the serial time to 30s, but does not work for SMP.
./pghpf_fd3d.x -pghpf -stat alls
./pghpf_fd3d.x -pghpf -np 2 -heapz 150m -stat alls
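In case it helps anyone reproduce the comparison, here is a rough sketch of how the two runs above could be timed side by side. It assumes a POSIX-like shell with `date +%s`. The helper name `time_run` is made up for illustration and is not part of any PGI tooling. It only measures whole seconds of wall-clock time.

```shell
#!/bin/sh
# time_run LABEL CMD [ARGS...]
# Hypothetical helper: runs CMD and prints LABEL with the elapsed
# wall-clock time in whole seconds. Output of CMD is left untouched
# so any statistics printed by the run remain visible.
time_run() {
    label=$1
    shift
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

# Only attempt the real runs if the executable from the compile line exists.
if [ -x ./pghpf_fd3d.x ]; then
    time_run serial ./pghpf_fd3d.x -pghpf -stat alls
    time_run smp ./pghpf_fd3d.x -pghpf -np 2 -heapz 150m -stat alls
fi
```

Running the two cases back to back this way makes it easier to see whether the -np 2 run is actually sharing the work or just doing the serial work twice: in the second case the smp wall-clock time would be about the same as the serial one.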