Unfortunately I can’t do this, as I work on a Cray XT5 supercomputer that doesn’t support dynamic libraries, preloads, etc., which Valgrind requires. I’ve discussed this with the developers already and they basically said Valgrind’s functionality would be very limited without the above.
I do agree with you though on it being some sort of memory problem. Before the code reached its current state I was able to pinpoint (using a debugger) some places where the code was apparently having problems…although the cause was strange to say the least.
In particular the seg faults initially were occurring at array initialisation statements of the form
A = 0 (where A is a 2D allocatable array).
The only way I could get execution past this point without seg faulting was to explicitly put in the dimension ranges as follows:
A(1:n, 1:m) = 0
This happened in about 3 cases. In each case checking the allocation status indicated the arrays were created successfully. With that done the code still seg-faults but I’m not really able to identify where. I think I have just been treating the symptoms at the moment as opposed to the cause.
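To make the pattern concrete, here is a minimal stand-alone sketch of the two assignment forms together with the allocation check. It is not the actual code: the program name, the sizes n and m, the real type of A and the ierr variable are placeholders I made up for illustration.

```fortran
program init_check
  implicit none
  real, allocatable :: A(:,:)      ! placeholder type; the real arrays may differ
  integer :: n, m, ierr

  n = 4                            ! placeholder sizes
  m = 3

  ! Allocate and check the status explicitly
  allocate(A(n, m), stat=ierr)
  if (ierr /= 0 .or. .not. allocated(A)) stop 'allocation failed'

  ! Confirm the array really has the extents we expect
  print *, 'shape(A) =', shape(A)

  A = 0                            ! whole-array form (where the seg faults occurred)
  A(1:n, 1:m) = 0                  ! explicit-section form (the workaround)
end program init_check
```

Both assignments should be equivalent for a correctly allocated array, so a tiny test like this would presumably run cleanly on its own; the point is only to show exactly what was changed.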
I did notice though when I compile with ‘-C’ (bounds-check) the code runs to termination successfully.
The Pathscale compiler also seg-faulted at the same locations I discussed above but by making the explicit dimension changes the code now runs to completion with or without any equivalent bounds-checking option.
Even stranger…when I compile the code with gfortran the code runs successfully to completion without any code modifications or debugging compiler options!!!
This is parallel MPI code that I am porting from a Blue Gene/L supercomputer to the Cray XT5. The developers apparently compile it on their BG/L, I assume with the IBM compiler, and don’t experience any problems.
Sorry for the mini-novel but I just wanted to give you some background info in case it sparks some ideas about what is happening. I’m just not sure if this is a coding problem or a compiler bug.
Thanks again for any tips you can pass on to identify the source of the problem.