Hello,
I would like to improve the performance of my code by parallelizing a loop. The loop’s input data are common to all loop cycles, but the computations in each cycle are clearly independent. I think this makes the loop well suited to parallelization. It does call a function which itself calls a chain of other subroutines and functions, but automatic inlining should make it parallelizable (at least to some extent, see below). The output of the loop is a 1D array of ‘nf’ complex numbers (‘nf’ being the number of loop cycles), which I finally want to export to a text file (writing the real and imaginary parts separately) for further processing with other software.
Here is the part of the code containing the loop I intend to parallelize:
273 !call acc_init(acc_device_nvidia)
274 !$acc region
275 !$acc do kernel, parallel, independent, private(k)
276 do k=1,nf
277 GxxV(k)=Gxxf(k,nl,d,c0,pi,fmin,df,eps,mu,sigma,alpha,dz,rho,theta,eta,zeta)
278 enddo
279 !$acc end region
280 open(unit=1,file='plmexjxFGPU64.out',status='replace')
281 do l=1,nf
282 write(1,*) real(GxxV(l)),imag(GxxV(l))
283 enddo
284 close(1)
The compiler command I used:
pgfortran plmexjx.f90 zbsubs.f machine.f90 -ta=nvidia -Minfo=accel -Minline,reshape -Mipa=inline,reshape
The compiler messages I got from it (-Minfo=accel messages only):
plmexjx.f90:
zbsubs.f:
machine.f90:
plmexjx.f90:
PGF90-W-0155-Accelerator region ignored; see -Minfo messages (plmexjx.f90: 274)
plmexjx:
274, Accelerator region ignored
276, Accelerator restriction: function/procedure calls are not supported
277, Accelerator restriction: function/procedure calls are not supported
0 inform, 1 warnings, 0 severes, 0 fatal for plmexjx
zbsubs.f:
machine.f90:
IPA: no IPA optimizations for 1 source files
IPA: Recompiling plmexjx.obj: new IPA information
plmexjx:
274, Accelerator region ignored
276, Accelerator restriction: size of the GPU copy of an array depends on values computed in this loop
277, Accelerator restriction: function/procedure calls are not supported
277, Accelerator restriction: : array accessed with too many dimensions, possibly due to inlining: ..inline
Accelerator restriction: : array accessed with too many dimensions, possibly due to inlining: real(..inline)
Accelerator restriction: : array accessed with too many dimensions, possibly due to inlining: imag(..inline)
Accelerator restriction: size of the GPU copy of '..inline' is unknown
Accelerator restriction: array accessed with too many dimensions, possibly due to inlining
Accelerator restriction: size of the GPU copy of an array depends on values computed in this loop
Accelerator restriction: one or more arrays have unknown size
277, Accelerator restriction: size of the GPU copy of '..inline' is unknown
Accelerator restriction: loop has multiple exits
Accelerator restriction: one or more arrays have unknown size
Accelerator restriction: array accessed with too many dimensions, possibly due to inlining
0 inform, 1 warnings, 0 severes, 0 fatal for plmexjx
IPA: Recompiling zbsubs.obj: new IPA information
I think several problems occurred at compilation, but I am having difficulty determining their origin:
- Apparently some inlining problems remain, but I cannot locate them because the list of ‘-Minfo=inline’ messages is so long that I cannot scroll back to the first messages in the compiler command window. Is there a way to export the compilation messages to a text file?
Regarding this, I just saw your reply to my previous post. I am using the PGI Workstation 12.4 (64) command-line shell on Windows. I tried what you suggested but did not succeed; maybe I did not understand the idea. I typed:
> pgfortran plmexjx.f90 zbsubs.f machine.f90 -ta=nvidia -Minfo=accel -Minline,reshape -Mipa=inline,reshape
and then:
> & logfile.txt
Is it correct?
I suspect these inlining problems come from the GO TO statements in the last procedure called in the chain, which would cause the ‘loop has multiple exits’ message (?). But this seems strange to me, because these GO TO statements only jump to labels inside the procedure (they never exit from it), and only sequentially within each loop cycle.
- Because of the inlining, I apparently also have problems with the loop’s output variable (GxxV), notably when I write its real and imaginary parts to the output text file. I tried to solve this by adding the ‘independent’ clause, but without success. What is causing this?
- ‘One or more arrays have unknown size’? Yet all array sizes have been explicitly allocated at variable declaration, before the loop.
- ‘Array accessed with too many dimensions, possibly due to inlining’: what does this mean, and how can I solve it?
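Coming back to the log file question: to make clear what I am trying to obtain, here is a minimal sketch of the redirection, assuming the PGI Workstation shell behaves like a bash-style shell (I am not sure it does). `fake_compile` is a hypothetical stand-in for the pgfortran command line, used only so the example runs on its own; `logfile.txt` is the file name I tried above.

```shell
#!/bin/sh
# Hypothetical stand-in for the pgfortran command: it prints one line on
# standard output and one line on standard error, as a compiler typically does.
fake_compile() {
    echo "plmexjx.f90:"
    echo "274, Accelerator region ignored" >&2
}

# Send standard output to logfile.txt, then point standard error at the same
# destination. The redirection is written on the SAME line as the command.
fake_compile > logfile.txt 2>&1

# Display what was captured in the log file.
cat logfile.txt
```

If this is the idea, then I suppose my mistake was typing the redirection as a separate command, and it should instead follow the compiler options on the same line.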
Thanks a lot in advance for your help.
Fred