I’m running a Fortran 90 code on a 7-node cluster. The initial size of my program is around 50 MB, and each node has 512 MB of RAM. The problem is that the program keeps growing until it exceeds the 512 MB of RAM, and at that point I always get a communication problem between nodes. I’m already using ALLOCATE/DEALLOCATE statements, so the program is supposed to free memory as it goes along. Nevertheless, it grows little by little until it stops after around 10 hours. The puzzling thing is that my code is iterative, so every iteration should use exactly the same amount of memory as the first one.
Does anybody have an idea of what is going on and how to solve it?
Thanks a lot,
It sounds like a memory leak. I’d try running Valgrind to see what you can find out. It’s only available for x86, so if you’re running on a 64-bit system, you’ll need to recompile your MPI libraries and your application as 32-bit (-tp k8-32). Although I have not tried running Valgrind with an MPI application, I believe all you need to do is put valgrind in front of your application name in your mpirun command.
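In case it helps while you look, here is a minimal sketch of one common way a Fortran 90 code can leak even though it uses ALLOCATE/DEALLOCATE: a POINTER array that gets allocated again on every iteration without the previous block being released first. I have not seen your code, so this is only an assumption about what might be happening; the names (leak_sketch, work, n, iter) are made up for illustration.

```fortran
program leak_sketch
  implicit none
  ! Hypothetical work array. POINTER arrays are never freed automatically.
  real, pointer :: work(:)
  integer :: iter, n, ierr

  n = 100000
  nullify(work)

  do iter = 1, 10
     ! Leaky pattern (left commented out): allocating a pointer that is
     ! already associated orphans the old block, so memory grows each pass.
     !   allocate(work(n))

     ! Safer pattern: release the old block before allocating a new one.
     if (associated(work)) deallocate(work)
     allocate(work(n), stat=ierr)
     if (ierr /= 0) stop 'allocation failed'

     work = real(iter)
     ! ... one iteration of the solver would go here ...
  end do

  ! Release the last block before the program ends.
  if (associated(work)) deallocate(work)
  print *, 'done'
end program leak_sketch
```

If your arrays are pointers, it is worth checking that every ALLOCATE inside the iteration loop has a matching DEALLOCATE on every path through the loop, including early exits and error branches.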