I am getting the following strange error on our Beowulf PC cluster
using PGI mpif77 / pgf77 while trying to restart my scientific
simulation:
PGFIO/stdio: Resource temporarily unavailable
PGFIO-F-/unformatted read/unit=1/error code returned by host stdio - 11.
File name = comm.1 unformatted, sequential access record = 1
In source file mydump.F, at line number 61
PSIlogger: Child with rank 0 exited with status 1.
PSIlogger: Child with rank 4 exited on signal 15.
PSIlogger: Child with rank 3 exited on signal 15.
PSIlogger: Child with rank 8 exited on signal 15.
PSIlogger: Child with rank 6 exited on signal 15.
PSIlogger: Child with rank 5 exited on signal 15.
PSIlogger: Child with rank 7 exited on signal 15.
PSIlogger: Child with rank 2 exited on signal 15.
PSIlogger: Child with rank 1 exited on signal 15.
PSIlogger: Child with rank 9 exited on signal 15.
PSIlogger: done
The code tries to read the file called comm.1, which contains all the
COMMON blocks in order to restart the scientific simulation from the
point where it previously stopped. The command ulimit in the console
shows me unlimited resources.
mydump.F, line number 61 contains the Fortran READ statement.
The stdio error 11 (EAGAIN, "Resource temporarily unavailable") comes from the operating system and indicates that the file could not be read at that moment. First check that your process can access the file and has read permission on it. The file may also be locked by another process.
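If it helps, the checks above can be scripted. The following is only a minimal Python sketch, not part of the PGI tools: the function name `diagnose_restart_file` is made up for illustration, and the lock probe only detects POSIX advisory locks held by other processes. It also assumes a Unix-like system where the `fcntl` module is available.

```python
import errno
import fcntl
import os


def diagnose_restart_file(path):
    """Return a list of human-readable findings about `path` (hypothetical helper)."""
    findings = []
    if not os.path.exists(path):
        findings.append("file does not exist")
        return findings
    if not os.access(path, os.R_OK):
        findings.append("no read permission for this user")
        return findings
    with open(path, "rb") as handle:
        try:
            # Non-blocking shared lock: fails immediately if another process
            # holds a conflicting (exclusive) advisory lock on the file.
            fcntl.lockf(handle, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                findings.append("file is locked by another process")
            else:
                findings.append("lock check failed: " + os.strerror(exc.errno))
        else:
            # Release the probe lock again before closing the file.
            fcntl.lockf(handle, fcntl.LOCK_UN)
            findings.append("file is readable and not locked")
    return findings
```

Running `diagnose_restart_file("comm.1")` from the directory where the simulation restarts, and as the same user, should give a first hint about which of the three causes applies.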
We’re not sure why this occurs, but one engineer thinks that your NFS file system is not mounted as “hard”. If it is “soft” mounted, requests to the NFS file server might be timing out.
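One way to check the mount type on Linux is to look at the options column in `/proc/mounts`, where NFS mounts normally list either `hard` or `soft`. The sketch below is an illustration under that assumption. The function name `nfs_mount_modes` is hypothetical, and it takes the file contents as a string so it can be tried on any text.

```python
def nfs_mount_modes(mounts_text):
    """Map each NFS mount point to 'hard', 'soft', or 'unknown' (hypothetical helper)."""
    modes = {}
    for line in mounts_text.splitlines():
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        mount_point = fields[1]
        options = fields[3].split(",")
        if "soft" in options:
            modes[mount_point] = "soft"
        elif "hard" in options:
            modes[mount_point] = "hard"
        else:
            modes[mount_point] = "unknown"
    return modes
```

On a compute node, you could call it as `nfs_mount_modes(open("/proc/mounts").read())` and look at the entry for the file system that holds comm.1.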
My guess is that when you restart your simulation, either not all of the previous processes have died or the file lock hasn’t been released yet. You can test this by running your simulation, stopping it, waiting about 2 minutes, and then restarting.
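The same wait-and-retry idea can be automated in a wrapper script around the restart. This is only a sketch of the technique and not something the PGI runtime does for you. `retry_on_eagain` and its parameters are hypothetical names, and `action` stands for whatever callable performs the read in your own script.

```python
import errno
import time


def retry_on_eagain(action, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Call `action()` until it stops raising EAGAIN or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as exc:
            # Re-raise anything that is not EAGAIN, and give up on the last try.
            if exc.errno != errno.EAGAIN or attempt == attempts:
                raise
            sleep(delay_seconds)
```

If the read succeeds after one or two retries, that supports the theory of a lingering process or a lock that has not been released yet. If it still fails after several minutes, the NFS mount options are the more likely cause.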
I have a few more emails out, but these are the best guesses so far.
Can you please send a report to trs@pgroup.com with instructions on how to reproduce this behavior? I’d like to have one of our engineers look into this and see if we can add any improvements.