SIGFPEs and the Accelerators

This is probably an unanswerable question, but I thought I’d pose it just in case.

I’m currently experimenting with using Accelerators in a well-established piece of software, trying to speed it up. I’m finding, however, that sometimes adding an

!$acc region

leads to SIGFPEs where there were none without the region:

p0_2643:  p4_error: interrupt SIGFPE: 8

Is this a usual happenstance, or are there certain functions one should avoid in accelerator regions that can lead to this, like exp() or log()?

(Note: that is a p4_error, meaning MPI, but this is with mpirun -np 1. I’m not near using >1 CPUs yet.)

ETA: I’m currently trying to track down which bit of the code is doing this. In my idiocy, I accelerated lots of innocent bits of code. One (or more) turned out not to be so innocent…
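For context, a minimal sketch of the kind of change involved (the loop body and variable names here are hypothetical, not from the actual code):

```fortran
! Hypothetical example: wrapping a simple DO loop in a PGI
! Accelerator region. In the real code, loops like this
! sometimes produced a SIGFPE only when accelerated.
!$acc region
      do i = 1, n
         b(i) = exp(a(i)) + log(c(i))
      end do
!$acc end region
```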

Hi Matt,

One thing that may help in narrowing down where the FPE is coming from is the “-Ktrap=fp” flag. It will cause your program to abort when an FPE is encountered and tell you the line number where it occurred. It won’t trap code run on the accelerator, but it will at least give you a better idea of the cause. In particular, I’d be looking for a divide by zero.

  • Mat

Mat,

The code is compiled with -Ktrap=fp:

...-fast -Kieee -ta=nvidia -Minfo=all,accel -r4     -Mextend -Ktrap=fp  irrad.f

The problem is, it must have something to do with the accelerator, because no line number is reported. Plus, as I’ve said, if I compile this for the host and not for the GPU, no SIGFPE is reported. Is there a way to get an approximate line number (say, at the calling-procedure level) with more verbosity?

I’m thinking, though, it’s time to one-by-one deaccelerate until I find it…

Hi Matt,

Is it possible to obtain the source? Either the issue is a difference in accuracy between the GPU and the CPU, or it’s a compiler bug. Either way, I’d like to take a look.

If it’s not publicly available, I can contact you directly, or you can send a note to PGI Customer Service (trs@pgroup.com).

Thanks,
Mat

I’m currently trying to isolate the code responsible for this (so I don’t need to pass on thousands of lines of code…though I might have to) and have been observing some oddities.

In the beginning, the SIGFPE occurred if I accelerated a group of 4 or 5 DO loops. When I then accelerated each loop one-by-one, no FPE. If I then rejiggered the loop scheduling (as mkcolg showed me was important in another thread), added the appropriate copy/copyin/copyout clauses, and then reaccelerated the entire group…no SIGFPE.

I think I might be passing along two sets of loops. One that causes the SIGFPE and one that doesn’t.
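A hedged sketch of the restructuring described above, assuming the old PGI Accelerator directive syntax (the schedule, clause choices, and array names are illustrative, not from the real code):

```fortran
! Hypothetical illustration of the fix: an explicit loop
! schedule plus explicit data clauses, instead of letting
! the compiler infer everything on its own.
!$acc region copyin(a(1:n)) copyout(b(1:n))
!$acc do parallel, vector(256)
      do i = 1, n
         b(i) = 2.0 * a(i)
      end do
!$acc end region
```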

Okay, I think I found a good code fragment to pass on. This isn’t the same one I was looking at…it’s one that’s more confusing. Confusing in that I’m not sure how an FPE is happening, and the math is simple. Coming your way, Mat…