fftw shockingly slow

Hello:

I have compiled some physics code using fftw with pg… v. 6.2-3 and with icc/ifort v. 9.1 to compare performance.

For intel, I used the -fast option, and found the fftw3.1.2 part runs in 1 minute, which is a little less than what I get using gcc.

Then, I compiled with the Portland Group compiler (both the fftw libraries and my program, of course), and the same fftw takes 10 minutes!

After trying a few different sets of options including just -fast -tp amd64, I found the best one is -O0 -tp amd64, for which the fftw part still takes 3.5 minutes!

I am quite sure there is something funny going on, but I have no idea what it is. Has anyone here seen something like this before?

Thanks,
Chris

Hi Chris,

> After trying a few different sets of options including just -fast -tp amd64, I found the best one is -O0 -tp amd64, for which the fftw part still takes 3.5 minutes!
>
> I am quite sure there is something funny going on, but I have no idea what it is. Has anyone here seen something like this before?

This is very odd, especially that unoptimized code would run nearly three times faster than optimized code. Is it possible to obtain the source, workload, and build instructions for your physics code? I'd be very interested in determining where the slowdown is.
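While the code is being cleared for sharing, a small timing harness can make comparisons between builds repeatable. The following is only a rough sketch, not part of the original code: it runs each executable a few times and keeps the best wall-clock time. The function names `time_command` and `compare_builds` are made up for illustration, and the demo at the bottom times trivial Python commands as stand-ins, since the real executable names are not known here.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd (an argv list) `repeats` times; return the best wall time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare_builds(builds, repeats=3):
    """Given {label: argv list}, return {label: best wall time in seconds}."""
    return {label: time_command(cmd, repeats) for label, cmd in builds.items()}


if __name__ == "__main__":
    # Stand-in commands (assumption): replace each argv list with the
    # executable produced by one compiler/flag combination.
    builds = {
        "build A": [sys.executable, "-c", "sum(range(10**5))"],
        "build B": [sys.executable, "-c", "sum(range(10**6))"],
    }
    for label, seconds in compare_builds(builds).items():
        print(f"{label}: {seconds:.3f} s")
```

Taking the best of several runs reduces noise from other activity on the machine. This measures the whole program, so isolating the fftw part would still need timers inside the code itself.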

Thanks,
Mat

Hi, Mat

Thanks for your quick reply. I have been away for a while, and just found it. I don’t think sharing the code is a problem, but I have to talk to a few people first.