STDIO - Error 11: Resource temporarily unavailable

Hello,

I am getting the following strange error on our Beowulf PC cluster
using PGI mpif77 / pgf77 while trying to restart my scientific
simulation:

PGFIO/stdio: Resource temporarily unavailable
PGFIO-F-/unformatted read/unit=1/error code returned by host stdio - 11.
File name = comm.1 unformatted, sequential access record = 1
In source file mydump.F, at line number 61
PSIlogger: Child with rank 0 exited with status 1.
PSIlogger: Child with rank 4 exited on signal 15.
PSIlogger: Child with rank 3 exited on signal 15.
PSIlogger: Child with rank 8 exited on signal 15.
PSIlogger: Child with rank 6 exited on signal 15.
PSIlogger: Child with rank 5 exited on signal 15.
PSIlogger: Child with rank 7 exited on signal 15.
PSIlogger: Child with rank 2 exited on signal 15.
PSIlogger: Child with rank 1 exited on signal 15.
PSIlogger: Child with rank 9 exited on signal 15.

PSIlogger: done


The code tries to read the file called comm.1, which contains all the
COMMON blocks in order to restart the scientific simulation from the
point where it previously stopped. The command ulimit in the console
shows me unlimited resources.

mydump.F, line number 61 contains the Fortran READ statement.

Thanks for any help,

Andreas

Hi Andreas,

stdio error 11 (EAGAIN) comes from the operating system and here indicates that the file is temporarily unavailable for some reason. First check that your process can access the file and has read permission. The file may also be locked by another process.

- Mat
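
For reference, the numeric code in the PGFIO message can be translated into its symbolic name and OS message. The following is only a small sketch in Python (not part of the PGI runtime); the helper name `describe_errno` is made up for illustration, and the mapping of 11 to EAGAIN is assumed to hold on Linux, as it does on the cluster described here.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as "EAGAIN";
    # unknown numbers fall back to a placeholder.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 11 is the value reported by "host stdio" in the error above.
    name, message = describe_errno(11)
    print(f"errno 11 = {name}: {message}")
```

Running this on the node where the job fails shows how that system names error 11, which helps confirm that the message really is EAGAIN and not a different code on another platform.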

Hi:

The file comm.1 has correct read/write permissions set:

-rw-r--r--

No other process writes to that file at the moment.
How can I check that the process has read permissions?

Andreas
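
One possible way to answer the question above is to test the file from a small script run as the same user, and on the same compute node, as the failing MPI rank. This is a hedged sketch in Python, not something from the PGI tools; `check_readable` is a hypothetical helper name. Note that `os.access` checks against the real user ID, so the script also tries an actual one-byte read.

```python
import errno
import os


def check_readable(path):
    """Report whether the current process can read `path`."""
    if not os.path.exists(path):
        return "missing"
    # Permission-bit check for the real user ID of this process.
    if not os.access(path, os.R_OK):
        return "no read permission"
    try:
        # An actual read can still fail (for example on a network
        # file system), so try to fetch one byte.
        with open(path, "rb") as handle:
            handle.read(1)
    except OSError as exc:
        return "read failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    return "readable"
```

For example, `print(check_readable("comm.1"))` run from the simulation's working directory would report whether the restart file can be opened and read there. If it reports "readable" on every node, plain permissions are unlikely to be the cause.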

Hi Andreas,

We’re not sure why this occurs, but one engineer thinks that your NFS file system is not mounted as “hard”. If it were “soft” mounted, the NFS file server might be timing out.

My guess is that when you restart your simulation, either not all the previous processes have died, or the file lock hasn’t been released yet. You can test this by first running your simulation, stopping it, waiting ~2 minutes, and then restarting.

I have a few more emails out, but these are the best guesses so far.

- Mat
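
The “hard” versus “soft” suggestion can be checked by looking at the mount options on a compute node. Below is a rough sketch in Python that assumes the Linux `/proc/mounts` layout (device, mount point, file system type, options); `nfs_mount_modes` is a hypothetical helper name. It treats a mount as soft only if the exact option `soft` is listed, on the assumption that NFS mounts default to hard otherwise.

```python
def nfs_mount_modes(mounts_text):
    """Map each NFS mount point to 'soft' or 'hard'.

    `mounts_text` is text in /proc/mounts format:
    device, mount point, fs type, comma-separated options, ...
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        # Assumption: without an explicit "soft" option the mount is hard.
        modes[fields[1]] = "soft" if "soft" in options else "hard"
    return modes
```

On a node, something like `nfs_mount_modes(open("/proc/mounts").read())` would list the NFS mounts and their mode. If the directory holding comm.1 shows up as "soft", that would be consistent with the time-out theory above, though it does not prove it.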

Hi,

thanks for the suggestions. I will try these.

In the meantime, we used the Intel ifort Compiler
instead of the pgf77 Compiler, and here the problem
does not occur.

We did not try the GNU Compiler since it is no longer
maintained within the SuSE Linux distribution on our PC cluster.

Andreas

Hi Andreas,

Can you please send a report to trs@pgroup.com with instructions on how to reproduce this behavior? I’d like to have one of our engineers look into this and see if we can add any improvements.

Thanks,
Mat

I have this same issue, except with stdout instead of stdin. The problem goes away if you run under the SGE queuing system.

Hi Bill,

Were you able to determine the cause of the original error?

Thanks,
Mat