I am not sure how -fast affects reading and writing binary files,
but I do know that older releases did not allow an I/O statement
to invoke another I/O statement while it was executing - for example
write (6, 100) foo(x)
where the function foo(x) itself contains an I/O statement, such as
write (6,200) V*V +200
So if your code does that, you may see differences in I/O performance,
because we did not handle nested I/O statements before.
That could be the reason.
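If you want to rule nested I/O out as the cause, one option is to call the function before the outer write, so that the two I/O statements never overlap. The program below is only a sketch: the body of foo, the format strings, and the variable names are assumptions made for illustration and are not taken from your code.

```fortran
program hoist_io
  implicit none
  real*8 :: x, y

  x = 3.0d0
  ! Call foo first, so its WRITE finishes before the outer WRITE starts.
  y = foo(x)
  write (6, 100) y
100 format (1x, 'foo(x) = ', f12.4)

contains

  ! Hypothetical stand-in for foo(x): a function that does its own I/O.
  function foo(v) result(r)
    real*8, intent(in) :: v
    real*8 :: r
    write (6, 200) v*v + 200
200 format (1x, 'inside foo: ', f12.4)
    r = v*v + 200
  end function foo

end program hoist_io
```

If the timing difference goes away with the call hoisted like this, nested I/O handling is a likely suspect.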
Another possible issue: if you have a large array that you initialize
in the data declaration area, it could be quite slow. For example,
real*8 :: x(1000000000) = 1234567.0D0
would make your program much bigger than if you declare the big arrays and initialize them at runtime, for example:
real*8, allocatable :: x(:)
.
allocate (x(100000000))
x=1234567.0D0
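Put together as a complete program, the runtime version might look like the sketch below. The program name, the stat= check, and the final write are additions for illustration; the array size is the one from the allocate statement above.

```fortran
program init_at_runtime
  implicit none
  integer, parameter :: n = 100000000   ! size from the allocate above
  real*8, allocatable :: x(:)
  integer :: ierr

  ! Allocate at runtime; the array takes no space in the executable file.
  allocate (x(n), stat=ierr)
  if (ierr /= 0) then
    write (6, *) 'allocate failed, stat = ', ierr
    stop 1
  end if

  ! Whole-array assignment sets every element when the program runs.
  x = 1234567.0D0
  write (6, *) 'x(1) = ', x(1), '  x(n) = ', x(n)

  deallocate (x)
end program init_at_runtime
```

Note that this needs about 800 MB of memory when it runs, since each of the n elements is 8 bytes.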
-Mvect=scalarsse is part of -fast, which is a collection of switches:
-O2 -Mnoframe -Mvect=scalarsse
In fact, several bugs have been discovered and worked around
by using
-fast -Mnovect
so that scalarsse is not used.
Look for speedups from -fast when there is a lot of computation
and a lot of data reuse. The best example is matrix multiply, where the
data reuse can be exploited to drastically reduce the number of memory
references, and therefore speed up the program.
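As a rough illustration of what "data reuse" means here, a plain matrix multiply looks like the sketch below. The size n, the array names, and the loop order are assumptions chosen for the example; this is just the kind of loop nest where optimization has a lot to work with, not a statement of what the compiler does to it.

```fortran
program matmul_reuse
  implicit none
  integer, parameter :: n = 512   ! assumed size, for illustration only
  real*8, allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k

  allocate (a(n,n), b(n,n), c(n,n))
  a = 1.0d0
  b = 2.0d0
  c = 0.0d0

  ! The triple loop does 2*n**3 floating-point operations on only
  ! 3*n**2 array elements, so each element is used about n times.
  ! The innermost loop runs over i, the first (contiguous) index.
  do j = 1, n
    do k = 1, n
      do i = 1, n
        c(i,j) = c(i,j) + a(i,k) * b(k,j)
      end do
    end do
  end do

  ! With these constant inputs every element of c equals 2*n.
  write (6, *) 'c(1,1) = ', c(1,1)

  deallocate (a, b, c)
end program matmul_reuse
```

Timing this with and without -fast is a simple way to see whether the switch helps on your machine. A code that mostly does I/O has far less reuse of this kind to exploit.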