I am porting a large science code (MFIX) to GPUs using pgf90 (v13.1) and OpenACC. For one portion of the code, no kernel code is generated, and I suspect the reason is that a subroutine called inside the accelerator region is not being inlined. The compiler does not report why the inlining fails (assuming that is indeed the cause).
The subroutine is (in file des_functions.f):
      SUBROUTINE DES_CROSSPRDCT_3D(AA, XX, YY)
      IMPLICIT NONE
      DOUBLE PRECISION AA(3), XX(3), YY(3)
      AA(1) = XX(2)*YY(3) - XX(3)*YY(2)
      AA(2) = XX(3)*YY(1) - XX(1)*YY(3)
      AA(3) = XX(1)*YY(2) - XX(2)*YY(1)
      RETURN
      END SUBROUTINE DES_CROSSPRDCT_3D
The call to it is (Line 469 in calc_force_des.f):
...
468 IF(DIMN.EQ.3) THEN
469    CALL DES_CROSSPRDCT_3D(V_ROT, OMEGA_SUM, NORMAL)
470 ELSE
...
The compilation command is:
pgf90 -O -Mdalign -acc -ta=nvidia,time -Minfo=inline,accel -Mipa=inline -Munixlogical -c -I. -Mnosave -Mfreeform -Mrecursive -Mreentrant -byteswapio -Minline=name:des_crossprdct_2d,name:des_crossprdct_3d ./des/calc_force_des.f
Excerpts of the compiler output:
PGF90-W-0155-Accelerator region ignored; see -Minfo messages (./des/calc_force_des.f: 404)
404, Accelerator region ignored
414, Accelerator restriction: function/procedure calls are not supported
469, Accelerator restriction: unsupported call to 'des_crossprdct_3d'
Note: Line 414 is the beginning of the DO LOOP that contains the above subroutine call.
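To make the structure concrete, here is a stand-alone sketch of the pattern described above. It is not the MFIX source: the program name, the particle count NP, the per-particle array shapes, the loop index LL, and the choice of the `kernels loop` directive are all placeholders assumed for illustration. Only the subroutine body and the call inside an `IF(DIMN.EQ.3)` branch within a DO loop are taken from the excerpts.

```fortran
! Stand-alone sketch of the call pattern; NOT the actual MFIX code.
! NP, LL, the array shapes and the directive are hypothetical placeholders.
PROGRAM CROSS_SKETCH
   IMPLICIT NONE
   INTEGER, PARAMETER :: NP = 4        ! hypothetical particle count
   INTEGER, PARAMETER :: DIMN = 3
   DOUBLE PRECISION :: OMEGA_SUM(3,NP), NORMAL(3,NP), V_ROT(3,NP)
   INTEGER :: LL

   ! Fill the inputs with simple test vectors (unit x and unit y)
   DO LL = 1, NP
      OMEGA_SUM(:,LL) = (/ 1.0D0, 0.0D0, 0.0D0 /)
      NORMAL(:,LL)    = (/ 0.0D0, 1.0D0, 0.0D0 /)
   END DO

   ! The directive below is an ordinary comment unless compiled with -acc.
   ! The loop body contains a subroutine call, as in the real code.
   !$acc kernels loop
   DO LL = 1, NP
      IF (DIMN.EQ.3) THEN
         CALL DES_CROSSPRDCT_3D(V_ROT(:,LL), OMEGA_SUM(:,LL), NORMAL(:,LL))
      END IF
   END DO

   ! Unit x cross unit y should give unit z
   PRINT *, V_ROT(:,1)
END PROGRAM CROSS_SKETCH

SUBROUTINE DES_CROSSPRDCT_3D(AA, XX, YY)
   IMPLICIT NONE
   DOUBLE PRECISION AA(3), XX(3), YY(3)
   AA(1) = XX(2)*YY(3) - XX(3)*YY(2)
   AA(2) = XX(3)*YY(1) - XX(1)*YY(3)
   AA(3) = XX(1)*YY(2) - XX(2)*YY(1)
   RETURN
END SUBROUTINE DES_CROSSPRDCT_3D
```

One difference from the real code is that this sketch keeps the caller and the subroutine in a single file, whereas in MFIX they live in calc_force_des.f and des_functions.f respectively.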
I can provide more details or access to the code as necessary.
Thanks very much in advance for the help.