Auto Parallelisation

I compiled my code successfully with Auto-Parallelization enabled in PVF (in the property window: Fortran | Optimization | Auto-Parallelization = Yes).
When I ran the code, it crashed, so I tried to track down where by printing some information.
I found that it crashes in some loops (not all of them).
Strangely, if I put an output statement inside the first problem loop, that loop completes, but the crash moves to a later loop.
If I repeat this for that loop, the crash moves to the next one.

My code has many loops, so it is not practical to put output statements in all of them. In any case, this should not be necessary.
Does someone know the reason?

Hi Jingyu Shi,

Unfortunately, there’s not enough information here to give you any good advice. The crash could be caused by any number of things: a stack overflow, a memory issue (for example an out-of-bounds access), a race condition, a compiler error, etc.

What I’d suggest is to compile with “-gopt” and then run the program through the debugger. It might give us a bit more information about why it’s crashing. You can then also post the section of code where the crash occurs. This may help narrow down the issue.

I’m guessing that by adding print statements, you inhibit some optimization of that loop, which allows it to succeed.

- Mat

Hi Mat,

Following your suggestion, here is the message shown in the Debug output window.


‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Windows\SYSTEM32\ntdll.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Windows\system32\kernel32.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Windows\system32\KERNELBASE.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Users\shi081\Documents\DDBEM\Codes\PortlandVirsualFortran\FRACOD3D_PVFDLL_CALL\FRACOD3D_PVFDLL_CALL\Win32\Debug\FRACOD3D_PVFDLL.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘c:\program files\pgi\win32\15.3\bin\pgf90.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘c:\program files\pgi\win32\15.3\bin\pgf90_rpm1.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘c:\program files\pgi\win32\15.3\bin\pgc.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Windows\system32\MSVCR100.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘C:\Windows\system32\msvcr120.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘c:\program files\pgi\win32\15.3\bin\pgftnrtl.dll’, No symbols loaded.
‘FRACOD3D_PVFDLL_CALL.exe’: Loaded ‘c:\program files\pgi\win32\15.3\bin\pgf90rtl.dll’, No symbols loaded.
The thread ‘0.0’ (0x2b34) has exited with code 0 (0x0).
The program : PGI Debug Engine’ has exited with code 0 (0x0).



The following is one section of the code where it crashes. It is inside the DLL FRACOD3D_PVFDLL.dll, which is called by FRACOD3D_PVFDLL_CALL.exe.


SUBROUTINE INITIALIZEGRIDPOINT
  IMPLICIT DOUBLE PRECISION (A-H,O-Z)
  INCLUDE 'PARAMETERS.INC' ! contains value of NGRIDPOINTS of 10000

  INTEGER NNODE,NELEM,NGRIDPOINT,NBELEM,NCRACKELEM,NCRACK,KINITIAL,KGINITIAL
  COMMON/GEOMETRY/NNODE,NELEM,NGRIDPOINT,NBELEM,NCRACKELEM,NCRACK,KINITIAL,KGINITIAL

  INTEGER NDE(NELEMS,7)
  COMMON/GLOBALNODE/NDE
  DOUBLE PRECISION XGRID(NGRIDPOINTS),YGRID(NGRIDPOINTS),ZGRID(NGRIDPOINTS) ! position of grid point
  INTEGER KGRID(NGRIDPOINTS),MGRID(NGRIDPOINTS),NELEMGRID ! MGRID(*) material of the grid point
  COMMON/GRIDPOINT/XGRID,YGRID,ZGRID,KGRID,MGRID,NELEMGRID

  DOUBLE PRECISION ECENT(NELEMS,3) ! Coordinates of "centre" of the element 1<=IE<=NELEMS
  COMMON/TRICENTRE/ECENT

  ! Local scratch arrays
  DOUBLE PRECISION XGRID0(NGRIDPOINTS),YGRID0(NGRIDPOINTS),ZGRID0(NGRIDPOINTS)
  INTEGER KGRID0(NGRIDPOINTS),MGRID0(NGRIDPOINTS)
  INTEGER IE,IGD,IGP2,NGRIDPOINT00


  DO 100 IE=1,NELEM
    IGP2=IE*2
    XGRID0(IGP2-1)=ECENT(IE,1)
    YGRID0(IGP2-1)=ECENT(IE,2)
    ZGRID0(IGP2-1)=ECENT(IE,3)
    KGRID0(IGP2-1)=1
    MGRID0(IGP2-1)=NDE(IE,6)

    XGRID0(IGP2)=ECENT(IE,1)
    YGRID0(IGP2)=ECENT(IE,2)
    ZGRID0(IGP2)=ECENT(IE,3)
    KGRID0(IGP2)=-1
    MGRID0(IGP2)=NDE(IE,6)
    !WRITE(16,*) 'Grid point', IGP2 !this statement will let the code pass through this loop
100 CONTINUE

  ! Two more loops follow here

END SUBROUTINE

Hi Jingyu Shi,

Does this program work without Auto-parallelization?

While I’m not an expert in programming DLLs, I’m thinking that this program should fail even without any optimization. DLLs are self-contained, so Common blocks need to be exported and/or imported in order to be shared between the main program and the DLL.

See section 12.3.1 of the PVF User’s Guide: http://www.pgroup.com/doc/pvfug.pdf
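
As a rough illustration of what exporting a Common block from the DLL side might look like, here is a minimal sketch. It assumes the "!DEC$ ATTRIBUTES DLLEXPORT" directive form, with the Common block name written between slashes; please check the exact syntax against the PVF User’s Guide section above before relying on it. The routine name SETNELEM is hypothetical and only there to give the block something to do; the GEOMETRY block is copied from the posted code.

```fortran
! Sketch only: DLL-side routine that exports itself and a Common block.
! SETNELEM is a hypothetical name; the directive syntax is an assumption
! to be verified against the PVF User's Guide.
SUBROUTINE SETNELEM(N)
!DEC$ ATTRIBUTES DLLEXPORT :: SETNELEM
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: N
  INTEGER NNODE,NELEM,NGRIDPOINT,NBELEM,NCRACKELEM,NCRACK,KINITIAL,KGINITIAL
  COMMON/GEOMETRY/NNODE,NELEM,NGRIDPOINT,NBELEM,NCRACKELEM,NCRACK,KINITIAL,KGINITIAL
!DEC$ ATTRIBUTES DLLEXPORT :: /GEOMETRY/

  ! Store the element count in the shared block
  NELEM = N
END SUBROUTINE SETNELEM
```

Code outside the DLL that declares the same Common block would presumably use the matching DLLIMPORT attribute. If every Common block is only ever touched inside the DLL, none of this should be needed.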

One thing to try is to not use a DLL and instead compile all the code together in one binary.

If the program does work as is without Auto-parallel, then my next best guess is that you’re exceeding the stack limit. You can check this by compiling with the “-Mchkstk” flag (in PVF look for the “check stack” option in the properties) and then setting the environment variable PGI_STACK_USAGE=1 before running the program.
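
If the stack check does point to the large local arrays, one possible workaround (a sketch under that assumption, not a confirmed diagnosis) is to allocate the scratch arrays on the heap instead. The stand-alone program below mirrors the shape of the posted loop with two of the arrays. The program name, the NELEM value and the simplified one-dimensional ECENT are made up for illustration.

```fortran
! Sketch only: heap-allocated scratch arrays in place of fixed-size locals.
! STACKSKETCH, NELEM = 5000 and the 1-D ECENT are illustrative assumptions.
PROGRAM STACKSKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: NGRIDPOINTS = 10000  ! size quoted in the posted code
  INTEGER, PARAMETER :: NELEM = 5000         ! assumed; needs 2*NELEM <= NGRIDPOINTS
  DOUBLE PRECISION, ALLOCATABLE :: XGRID0(:)
  INTEGER, ALLOCATABLE :: KGRID0(:)
  DOUBLE PRECISION :: ECENT(NELEM)
  INTEGER :: IE, IGP2, ISTAT

  ! Allocate on the heap and check that the allocation succeeded
  ALLOCATE(XGRID0(NGRIDPOINTS), KGRID0(NGRIDPOINTS), STAT=ISTAT)
  IF (ISTAT /= 0) STOP 'allocation failed'

  ! Dummy element-centre data
  DO IE = 1, NELEM
    ECENT(IE) = DBLE(IE)
  END DO

  ! Same access pattern as the posted loop: two grid points per element
  DO IE = 1, NELEM
    IGP2 = IE*2
    XGRID0(IGP2-1) = ECENT(IE)
    KGRID0(IGP2-1) = 1
    XGRID0(IGP2)   = ECENT(IE)
    KGRID0(IGP2)   = -1
  END DO

  ! KGRID0 alternates +1 and -1, so its sum should be zero
  PRINT *, 'sum of KGRID0 =', SUM(KGRID0)

  DEALLOCATE(XGRID0, KGRID0)
END PROGRAM STACKSKETCH
```

In the real subroutine the same change would apply to XGRID0, YGRID0, ZGRID0, KGRID0 and MGRID0, which together take roughly 320 KB as fixed-size locals if NGRIDPOINTS is 10000.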

Hope this helps,
Mat

Hi Mat,

The program runs fine without parallel computing, compiled with both PGI and another company’s compiler. All the common blocks and computations are inside the DLL, which has many subroutines; the main (executable) program is only used to start a test run with the PGI compiler (I am new to the PGI compilers). Later, I will have to link the DLL to another program written in C, so more problems may arise.

From the very start, when I created the project for the executable in PVF, I misunderstood the -Bstatic and -Bdynamic options. I thought -Bdynamic should not be used if -Bstatic is present (it cannot be deleted in the PVF property window), so I did not use the -Bdynamic option when compiling and linking the executable. When the -Bdynamic option is used, the problem disappears and it runs fine.

I do not know why it runs fine without the -Bdynamic option when parallel computing is not enabled.

Thanks very much for the help.

Jingyu