I had a pgf77 code that could not be used for large arrays, so I went ahead and wrote a pgf90 version with dynamic allocation and modules. I made sure that the overall number of arrays created was reduced drastically. For the same array size, I found that the pgf90 code took much less memory than the pgf77 code. But when I increased the array size, I found that the pgf90 code started to swap heavily even though only 85% of memory was in use. The pgf77 code did not swap for the same array size. Basically, it appears that the threshold beyond which the code starts to swap is lower for pgf90 than for pgf77.
We are compiling the codes on a 64-bit AMD dual-Opteron (not dual-core) system running Red Hat Linux 2.4.2, with 4 GB RAM and 4 GB swap per node. The codes are being compiled with the PG compilers (Ver. 6.0; we have yet to install the latest 6.1) with the following options:
I don’t have really good advice for you, but hopefully we can determine what’s going on. For clarification, you have written your program using two different methods: in the F90 version you allocate your arrays in modules, while in the F77 version you have statically allocated arrays. What you’re seeing is that the dynamically allocated arrays seem to be taking up more memory than the statically allocated arrays, thus causing more page swapping.
First, the amount of memory used should be only slightly different between the dynamic and static arrays. Dynamically allocated arrays do need a descriptor, but this is relatively small. Also, the F90 runtime can use more memory than its F77 counterpart, but I’m not sure if this can account for the difference you’re seeing. One major difference is that dynamically allocated arrays are allocated on the heap, while statically allocated arrays are typically placed in the program’s static data segment (or, for local arrays that are not saved, on the stack).
Some things that would cause a large difference in the amount of memory used would be if you’ve forgotten to deallocate your allocated memory, or if you’re using the POINTER attribute instead of ALLOCATABLE and passing the array to a function without an interface. Since a POINTER may not be contiguous, a temporary copy of the array may be made when passing the array to a function.
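As a rough illustration of the two points above, here is a minimal sketch and not your actual code. The module name `work_arrays`, the array `field`, the routine `scale_field`, and the size `n` are all hypothetical. It declares the array ALLOCATABLE rather than POINTER, passes it to a module procedure so that an explicit interface is available, and deallocates it once it is no longer needed.

```fortran
! Minimal sketch; all names and sizes here are hypothetical.
module work_arrays
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! An ALLOCATABLE array is always contiguous, unlike a POINTER array.
  real(dp), allocatable :: field(:,:)
contains
  ! A module procedure has an explicit interface automatically, so the
  ! assumed-shape dummy argument receives the array descriptor and a
  ! contiguous actual argument does not need a temporary copy.
  subroutine scale_field(a, factor)
    real(dp), intent(inout) :: a(:,:)
    real(dp), intent(in)    :: factor
    a = a * factor
  end subroutine scale_field
end module work_arrays

program demo
  use work_arrays
  implicit none
  integer :: n, ierr

  n = 1000
  ! Check the allocation status instead of assuming success.
  allocate(field(n, n), stat=ierr)
  if (ierr /= 0) stop 'allocation of field failed'

  field = 1.0_dp
  call scale_field(field, 2.0_dp)
  print *, 'sum =', sum(field)

  ! Release the memory as soon as the array is no longer needed.
  if (allocated(field)) deallocate(field)
end program demo
```

If your arrays really must be POINTERs, putting the called routines in a module (or writing interface blocks for them) should at least let the compiler see whether a copy is required.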
Valgrind does have a heap profiler called “Massif” and a memory checker. While I have not used Massif, I’ve used the memory checker quite often to find memory leaks. Hopefully Valgrind can help you too.