Run time error for f2py wrapped fortran

Hi all,
I’m cross-posting this problem to the f2py, Python and Fortran developers.
I’m running some Fortran subroutines from a Python script (the Fortran is wrapped with f2py, but compiled with pgi6.0). My environment is an amd64_x86 linux 2.4 cluster (Opteron) with pgi6.0 and libc.so.6

Sometimes, a write(*,*) in my Fortran code produces the following run-time error:

PGFIO-F-202/list-directed write/unit=6/conflicting specifiers.
File name = stdout formatted, sequential access record = 1
In source file module2/module2.f, at line number 2

The same code works fine when built as a standalone Fortran binary.

The problem appears when using pgf90 but not with pgf77.

Any help?

thanks a lot
labrach

I’ve used f2py. I will try it myself and let you know what I find out.
You say the error occurs sometimes, but not at other times? What types of data are you writing when the error occurs?

I got to this sooner than I thought. Works for me…

Here is hello.f. Also copied to testhello.f90

      subroutine shello(a)
      integer a
      print *,"Hello from Fortran!"
      print *,"a = ",a
      write (*,*) "Hi * *"
      end

brentl@leback:~/simple> f2py --fcompiler=pg -c hello.f -m hello
numpy_info:
FOUND:
define_macros = [(‘NUMERIC_VERSION’, ‘“\“23.3\””’)]
include_dirs = [‘/usr/include/python2.3’]

running build
running config_fc
running build_src
building extension “hello” sources
f2py:> /tmp/tmpspkPQh/src/hellomodule.c
creating /tmp/tmpspkPQh
creating /tmp/tmpspkPQh/src
Reading fortran codes…
Reading file ‘hello.f’
Post-processing…
Block: hello
Block: shello
Post-processing (stage 2)…
Building modules…
Building module “hello”…
Constructing wrapper function “shello”…
shello(a)
Wrote C/API module “hello” to file “/tmp/tmpspkPQh/src/hellomodule.c”
adding ‘/tmp/tmpspkPQh/src/fortranobject.c’ to sources.
adding ‘/tmp/tmpspkPQh/src’ to include_dirs.
copying /usr/lib64/python2.3/site-packages/f2py2e/src/fortranobject.c -> /tmp/tmpspkPQh/src
copying /usr/lib64/python2.3/site-packages/f2py2e/src/fortranobject.h -> /tmp/tmpspkPQh/src
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
customize PGroupFCompiler
customize PGroupFCompiler using build_ext
building ‘hello’ extension
compiling C sources
gcc options: ‘-pthread -fno-strict-aliasing -DNDEBUG -O2 -fmessage-length=0 -Wall -fPIC’
creating /tmp/tmpspkPQh/tmp
creating /tmp/tmpspkPQh/tmp/tmpspkPQh
creating /tmp/tmpspkPQh/tmp/tmpspkPQh/src
compile options: ‘-I/usr/include/python2.3 -I/tmp/tmpspkPQh/src -I/usr/include/python2.3 -c’
gcc: /tmp/tmpspkPQh/src/fortranobject.c
/tmp/tmpspkPQh/src/fortranobject.c: In function `fortran_doc’:
/tmp/tmpspkPQh/src/fortranobject.c:123: warning: int format, different type arg (arg 3)
gcc: /tmp/tmpspkPQh/src/hellomodule.c
compiling Fortran sources
pgf77(f77) options: ‘-fpic -Minform=inform -Mnosecond_underscore -fast’
pgf90(f90) options: ‘-fpic -Minform=inform -Mnosecond_underscore -fast’
pgf90(fix) options: ‘-Mfixed -fpic -Minform=inform -Mnosecond_underscore -fast’
compile options: ‘-I/usr/include/python2.3 -I/tmp/tmpspkPQh/src -I/usr/include/python2.3 -c’
pgf77:f77: hello.f
pgf90 -shared -fpic /tmp/tmpspkPQh/tmp/tmpspkPQh/src/hellomodule.o /tmp/tmpspkPQh/tmp/tmpspkPQh/src/fortranobject.o /tmp/tmpspkPQh/hello.o -o ./hello.so
Removing build directory /tmp/tmpspkPQh

brentl@leback:~/simple> python
Python 2.3.4 (#1, Feb 7 2005, 15:05:26)
[GCC 3.3.4 (pre 3.3.5 20040809)] on linux2
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import hello
>>> dir(hello)
['__doc__', '__file__', '__name__', '__version__', 'as_column_major_storage', 'has_column_major_storage', 'shello']
>>> hello.shello(3)
Hello from Fortran!
a = 3
Hi * *
>>> quit
'Use Ctrl-D (i.e. EOF) to exit.'



Then I tried it with the f90 file…

brentl@leback:~/simple> f2py --fcompiler=pg -c testhello.f90 -m testhello
numpy_info:
FOUND:
define_macros = [(‘NUMERIC_VERSION’, ‘“\“23.3\””’)]
include_dirs = [‘/usr/include/python2.3’]

running build
running config_fc
running build_src
building extension “testhello” sources
f2py:> /tmp/tmpPdSqjM/src/testhellomodule.c
creating /tmp/tmpPdSqjM
creating /tmp/tmpPdSqjM/src
Reading fortran codes…
Reading file ‘testhello.f90’
Post-processing…
Block: testhello
Block: shello
Post-processing (stage 2)…
Building modules…
Building module “testhello”…
Constructing wrapper function “shello”…
shello(a)
Wrote C/API module “testhello” to file “/tmp/tmpPdSqjM/src/testhellomodule.c”
adding ‘/tmp/tmpPdSqjM/src/fortranobject.c’ to sources.
adding ‘/tmp/tmpPdSqjM/src’ to include_dirs.
copying /usr/lib64/python2.3/site-packages/f2py2e/src/fortranobject.c -> /tmp/tmpPdSqjM/src
copying /usr/lib64/python2.3/site-packages/f2py2e/src/fortranobject.h -> /tmp/tmpPdSqjM/src
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
customize PGroupFCompiler
customize PGroupFCompiler using build_ext
building ‘testhello’ extension
compiling C sources
gcc options: ‘-pthread -fno-strict-aliasing -DNDEBUG -O2 -fmessage-length=0 -Wall -fPIC’
creating /tmp/tmpPdSqjM/tmp
creating /tmp/tmpPdSqjM/tmp/tmpPdSqjM
creating /tmp/tmpPdSqjM/tmp/tmpPdSqjM/src
compile options: ‘-I/usr/include/python2.3 -I/tmp/tmpPdSqjM/src -I/usr/include/python2.3 -c’
gcc: /tmp/tmpPdSqjM/src/fortranobject.c
/tmp/tmpPdSqjM/src/fortranobject.c: In function `fortran_doc’:
/tmp/tmpPdSqjM/src/fortranobject.c:123: warning: int format, different type arg (arg 3)
gcc: /tmp/tmpPdSqjM/src/testhellomodule.c
compiling Fortran sources
pgf77(f77) options: ‘-fpic -Minform=inform -Mnosecond_underscore -fast’
pgf90(f90) options: ‘-fpic -Minform=inform -Mnosecond_underscore -fast’
pgf90(fix) options: ‘-Mfixed -fpic -Minform=inform -Mnosecond_underscore -fast’
compile options: ‘-I/usr/include/python2.3 -I/tmp/tmpPdSqjM/src -I/usr/include/python2.3 -c’
pgf90:fix: testhello.f90
pgf90 -shared -fpic /tmp/tmpPdSqjM/tmp/tmpPdSqjM/src/testhellomodule.o /tmp/tmpPdSqjM/tmp/tmpPdSqjM/src/fortranobject.o /tmp/tmpPdSqjM/testhello.o -o ./testhello.so
Removing build directory /tmp/tmpPdSqjM
brentl@leback:~/simple> python
Python 2.3.4 (#1, Feb 7 2005, 15:05:26)
[GCC 3.3.4 (pre 3.3.5 20040809)] on linux2
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import testhello
>>> testhello.shello(3)
Hello from Fortran!
a = 3
Hi * *

Importing only one module also works for me.
Try importing two modules, for example:
f2py -c testhello.f90 -m testhello1
f2py -c testhello.f90 -m testhello2

then
import testhello1
import testhello2
testhello1.shello()
testhello2.shello()

I get the error in this case.
Thanks

I get the same error too. I will have to see whether it is something f2py is doing or something pgf90 is doing.

Great !!
I’m not alone in this world …
I’m also searching, but it’s difficult to debug.
Thanks a lot

(you can alert me at labrach@free.fr)

laurent

If I just generate two .so files the normal way from the fortran subroutines which become python modules, and I link them into a fortran main program,
I don’t get an error. I guess the next step is to try to link these two .so files into a gcc main program. Do you see anywhere in the python linkage code where it is trying to redefine stdout?

Yes, so I can generate two .so files using our pgf90 compiler. I can then
link them into a main program compiled and linked with gcc. I do not get
any error. Therefore, I suspect the problem has something to do with the python runtime. I didn’t see anything in the python linkage code that mucks with stdout. Could there be something in the 2nd “import” statement that is causing this?

The ‘normal’ compile/link/exec mode with pgf90 also works fine for me.
The problem only occurs when the .so files are imported into the python interpreter.

Do you see anywhere in the python linkage code where it is trying to redefine stdout?

The bug also appears with other I/O statements. For example, let’s put this in the second module:
open(1,file='test.txt')
write(1,*) 'Hello'
close(1)

and the message is in this case :
PGFIO-F-201/OPEN/unit=1/illegal value for specifier.
File name = test.txt
In source file module2.f90, at line number 2

And this works if I import only module2.

Something changes at the second import …

I’ve also tried to debug the python interpreter, but I don’t know what to monitor …


Please, Brent, keep in touch, because I feel lost …
and thanks for your help

laurent
labrach@free.fr

I was able to simplify the problem down to this code segment:

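#include <dlfcn.h>
#include <stdio.h>

/* Function-pointer type for the loaded entry points
   (Python's import code uses a typedef of the same name). */
typedef void (*dl_funcptr)(void);

int main()
{
    dl_funcptr p;
    void *handle;

    /* Load the first f90 library and call its subroutine. */
    handle = dlopen("./shello.so", RTLD_NOW);
    if (handle == NULL) {
        printf("shello.so not there %s\n", dlerror());
        return -1;
    }
    p = (dl_funcptr) dlsym(handle, "shello_");
    (*p)();

    /* Load the second f90 library the same way and call its subroutine. */
    handle = dlopen("./thello.so", RTLD_NOW);
    if (handle == NULL) {
        printf("thello.so not there %s\n", dlerror());
        return -1;
    }
    p = (dl_funcptr) dlsym(handle, "thello_");
    (*p)();
    return 0;
}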

When python imports the f2py created .so files, it uses the dlopen call.
The flag argument it uses is 2, for RTLD_NOW. What happens is that the 2nd call to dlopen cannot access the global symbols pulled in from the first dlopen call. Found this on the web:

flag must be either RTLD_LAZY, meaning resolve undefined symbols as code from the dynamic library is executed, or RTLD_NOW, meaning resolve all undefined symbols before dlopen returns, and fail if this cannot be done. Optionally, RTLD_GLOBAL may be or’ed with flag, in which case the external symbols defined in the library will be made available to subsequently loaded libraries.
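
To make those flag values concrete, here is a small Python sketch that decodes a dlopen flag integer into the names quoted above. It assumes a modern Python 3 on a Unix-like system, where the os module exposes the RTLD_* constants (the Python 2.3 used in this thread does not have them there), and describe_dlopen_flags is a hypothetical helper name, not part of any library.

```python
import os


def describe_dlopen_flags(flags):
    """Return the names of the RTLD_* bits that are set in `flags`.

    Only the three flags mentioned in the dlopen description are checked;
    the numeric values come from the running platform's `os` module.
    """
    names = []
    for name in ("RTLD_LAZY", "RTLD_NOW", "RTLD_GLOBAL"):
        if flags & getattr(os, name):
            names.append(name)
    return names


if __name__ == "__main__":
    # The value Python passes by default according to this thread.
    print(describe_dlopen_flags(os.RTLD_NOW))
    # The same value with RTLD_GLOBAL or'ed in.
    print(describe_dlopen_flags(os.RTLD_NOW | os.RTLD_GLOBAL))
```

On Linux with glibc, RTLD_NOW is 2 and RTLD_GLOBAL is 256, so or-ing them together gives 258.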

When I change the above calls to dlopen to or in RTLD_GLOBAL, everything works fine. The place where this needs to change in the Python source is dynload_shlib.c, line 129 (In python 2.3.4).

An easy way to do this from python without recompiling:

import sys
sys.setdlopenflags(258)
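
A slightly more defensive variant of the same workaround, sketched under a few assumptions: it targets a modern Python 3 on a Unix-like system (where os.RTLD_NOW and os.RTLD_GLOBAL exist), it builds the flag value from the named constants instead of hard-coding 258, and it restores the previous flags afterwards so that only the chosen imports are affected. The name dlopen_flags is hypothetical, and whether limiting the scope like this helps in any particular setup is untested here.

```python
import contextlib
import os
import sys

# RTLD_NOW | RTLD_GLOBAL; on Linux/glibc this is 2 | 256 == 258.
GLOBAL_NOW = os.RTLD_NOW | os.RTLD_GLOBAL


@contextlib.contextmanager
def dlopen_flags(flags=GLOBAL_NOW):
    """Temporarily change the flags Python passes to dlopen() on import."""
    previous = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Put the interpreter back the way we found it, even on error.
        sys.setdlopenflags(previous)


# Usage (module names taken from this thread):
#
#     with dlopen_flags():
#         import testhello1
#         import testhello2
```

Extension modules imported outside the with block are loaded with the interpreter's original flags.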

That is working !!!

I can’t believe it :) I was beginning to lose confidence …

Now, why does this only appear with f90 PGI-compiled libraries?
It works fine with the commercial compilers on Solaris, HP-UX, IRIX64 and Intel Linux.

Do we need to report this to the Python developers?

Anyway, I can easily fix the problem with the import sys method.

Thanks very much, brentl, for your help

laurent

We are still investigating why it happens. Or, why it works with other compilers. Given the explanation of RTLD_GLOBAL, I’m not sure why that is not always needed. So, we need to dig a little more. But, at least you have a work-around for now.

If you turn on debugging, you will see that when you do the first python import, all of our pgi runtime .so files get loaded. When you do the second python import, dlopen knows that the pgi runtime does not need to be reloaded. But how does the second f90 subroutine access the pgi runtime routines? It obviously does, as we get a pgi f90 runtime error. But, something is not right, either uninitialized data or data is not shared properly.
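
One way to watch this from inside the interpreter, offered as a rough sketch and not as the debugging method referred to above: on Linux, /proc/self/maps lists every file mapped into the process, so the set of shared objects can be compared before and after each import. The helper name loaded_shared_objects is hypothetical, and the module name in the usage comment is the one from this thread.

```python
def loaded_shared_objects(maps_path="/proc/self/maps"):
    """Return the sorted paths of shared objects mapped into this process.

    Each line of /proc/<pid>/maps has the form
        address perms offset dev inode [pathname]
    and anonymous mappings have no pathname field.
    """
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(None, 5)
            if len(fields) < 6:
                continue  # anonymous mapping, nothing to record
            path = fields[5].strip()
            # Keep only files whose name looks like a shared object.
            if ".so" in path.rsplit("/", 1)[-1]:
                paths.add(path)
    return sorted(paths)


# Usage (Linux only):
#
#     before = set(loaded_shared_objects())
#     import testhello1
#     after = set(loaded_shared_objects())
#     for path in sorted(after - before):
#         print(path)
```

Running the same comparison around the second import shows which libraries, if any, are newly mapped at that point.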

So, we will continue to investigate, and pass it onto our runtime engineers.

If your f90 compilers share i/o runtime with the compiler used to compile python, there will most likely never be problems of this sort. So, that might be why at least some of the solaris, hp-ux, irix etc. are okay.

Recently we switched to the f2py that comes with numpy and ran into lots of problems with the sys.setdlopenflags(258) workaround.

As soon as we call “sys.setdlopenflags(258)”, we can no longer import any *.so files built with the f2py included in numpy. If we try, we get a “Segmentation fault” and python terminates.

Here is the error in full: (fsim.so is f2py wrapped pgf90 compiled code)

[bflynt@master hiarms_pyrcas]# python
Python 2.4.2 (#1, May 2 2006, 08:28:01)
[GCC 4.1.0 (SUSE Linux)] on linux2
Type “help”, “copyright”, “credits” or “license” for more information.

>>> import sys
>>> sys.setdlopenflags(258)
>>> import fsim
Segmentation fault
[bflynt@master hiarms_pyrcas]#

If I leave out the sys.setdlopenflags(258) call, the module imports fine until the code tries to write or print something, which was the reason we all used the workaround.

Has PG tried to fix this issue with the compiler? It has been almost 2 years now and still this problem exists.


Thanks,
Bryan

It will take me a little while to recreate this testing environment. In the meantime, could you try using -Bstatic_pgi on the pgi link line when you create the .so file? This might be another temporary work-around.

The real answer is that we are working to fix this in our 7.1 compiler which should be out later this year.