F90 slow to read files

Hello everyone,

I am working with an F90 code, with the ultimate goal of porting it to run on GPUs. I am actively working on it, but in the meantime I noticed the following.

The code reads data from about 60 files. When compiled with gfortran 9.3.0 (with option -O3) the time to read the files is ~4.4 sec. When compiled with pgfortran 19.10 community edition (with option -O4), the time to read the same files is ~9.5 sec.

I am wondering why there is such a large difference between the two compilers and what I could do to improve this part of the program (ideally for both compilers).

Thank you in advance for any help you can provide!

Difficult to tell without a reproducing example. We’re just using the system’s stdio so the performance difference would be there.

Though, one possibility is the I/O buffer size being used, and that stdio defaults to blocking I/O. You might try overriding these via calls to the system’s setvbuf function, for which we provide Fortran interfaces. See: https://docs.nvidia.com/hpc-sdk/compilers/fortran-ref-guide/index.html#setvbuf

Thank you for answering. I tried using setvbuf to set the buffer size to the size of each file read (7077888 bytes), but it made no difference in execution time.

I tried to create a minimal example by extracting only the code that reads the files. I hope that with the code more help can be provided. The naming scheme of the input files is file_00_00_00.dat, file_00_00_01.dat, …, file_02_03_04.dat.

  PROGRAM TEST_INPUT

  IMPLICIT NONE

  INTEGER      X, Y, Z, MAX_X, MAX_Y, MAX_Z, ISOUR, IR, IT, ITIM
  INTEGER*8    RATE, START_TIME, END_TIME
  CHARACTER*20 FILENAME
  REAL         TIME, W

  DIMENSION W(-99:8291, 21, 3, 6, 100)

  CHARACTER*2 CHR(100)

  DATA CHR/'00', '01', '02', '03', '04', '05', '06', '07', '08', '09', &
           '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', &
           '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', &
           '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', &
           '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', &
           '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', &
           '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', &
           '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', &
           '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', &
           '90', '91', '92', '93', '94', '95', '96', '97', '98', '99'/

  CALL SYSTEM_CLOCK(COUNT_RATE = RATE)

  MAX_X = 3
  MAX_Y = 4
  MAX_Z = 5

  WRITE (*, *) 'Reading input files'

  CALL SYSTEM_CLOCK(START_TIME)

  DO X = 1, MAX_X
     DO Y = 1, MAX_Y
        DO Z = 1, MAX_Z
           ISOUR = (X - 1) * MAX_Y * MAX_Z + (Y - 1) * MAX_Z + Z

           FILENAME = 'file'//'_'//CHR(X)//'_'//CHR(Y)//'_'//CHR(Z)//'.dat'

           OPEN (100, FORM = 'unformatted', FILE = FILENAME)

           DO IR = 1, 6
              DO IT = 1, 6
                 DO ITIM = 1, 8192
                    READ (100) TIME, W(ITIM, IR, 1, IT, ISOUR), W(ITIM, IR, 2, IT, ISOUR), W(ITIM, IR, 3, IT, ISOUR)
                 ENDDO
              ENDDO
           ENDDO

           CLOSE (100)

        ENDDO
     ENDDO
  ENDDO

  CALL SYSTEM_CLOCK(END_TIME)

  WRITE (*, *) 'Time required: ', REAL(END_TIME - START_TIME)/REAL(RATE)

  END PROGRAM TEST_INPUT

Thanks venetis. I was able to recreate the performance issue by modifying your code to first write out the 60 files and then time how long it took to read them back in. I have filed an issue report, TPR #28801, and sent it to our compiler engineers for further evaluation.
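For anyone who wants to try the same kind of reproduction locally, a minimal sketch follows. It is not the reproducer attached to the TPR. The file names (`repro_NN.dat`), the number of files (4 rather than 60, to keep the run short), and the values written are all assumptions for illustration. Only the record shape (four reals per record, 6 x 6 x 8192 records per file) is taken from the code posted above.

```fortran
program repro_sketch
  implicit none
  ! Hypothetical settings: 4 files instead of the 60 used in the thread.
  integer, parameter :: nfiles = 4
  integer, parameter :: nrec = 6*6*8192      ! records per file, as in the posted code
  integer            :: f, i, nbad
  integer(8)         :: rate, t0, t1
  real               :: t, a, b, c
  character(len=20)  :: fname

  call system_clock(count_rate=rate)

  ! Step 1: write the test files, one sequential unformatted record per 4 reals.
  do f = 1, nfiles
     write (fname, '(A,I2.2,A)') 'repro_', f, '.dat'
     open (10, file=trim(fname), form='unformatted', status='replace')
     do i = 1, nrec
        write (10) real(i), 1.0, 2.0, 3.0    ! arbitrary placeholder values
     end do
     close (10)
  end do

  ! Step 2: time reading them back, record by record, as the original code does.
  nbad = 0
  call system_clock(t0)
  do f = 1, nfiles
     write (fname, '(A,I2.2,A)') 'repro_', f, '.dat'
     open (10, file=trim(fname), form='unformatted', status='old')
     do i = 1, nrec
        read (10) t, a, b, c
        ! Check the data so the reads cannot be optimized away.
        if (t /= real(i) .or. a /= 1.0 .or. b /= 2.0 .or. c /= 3.0) nbad = nbad + 1
     end do
     close (10, status='delete')             ! remove the test file again
  end do
  call system_clock(t1)

  write (*, *) 'Mismatched records: ', nbad
  write (*, *) 'Time required: ', real(t1 - t0)/real(rate)
end program repro_sketch
```

Building the same source with each compiler and the optimization flags mentioned earlier in the thread gives a like-for-like comparison of the read loop alone.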

Thank you! Is it possible for me to somehow follow any solution that will be provided? If not, I would really appreciate it if you could come back and comment on this thread.

No, at least not with our team’s older TPR bug-tracking system, though you’re welcome to post here and I can provide a status update.

Dear Mat,

Just wanted to ask whether we have any news on this.

Thanks!

Looks to be a back-end issue with the LLVM run-time. Forwarded it to the community for investigation.
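For readers looking for something to try in the meantime, one option is to read each file in a single stream-access transfer and pick the values out of the raw records afterwards, so that the run-time handles one large read instead of 294912 small ones. The sketch below is an untested idea with respect to the pgfortran slowdown, not a confirmed fix. It assumes 4-byte record markers before and after each record, which is consistent with the 7077888-byte file size quoted above (294912 records x 24 bytes). The file name `stream_sketch.dat` and the values written are hypothetical.

```fortran
program stream_read_sketch
  use iso_fortran_env, only: int32, real32
  implicit none
  integer, parameter :: nrec = 6*6*8192          ! records per file, as in the thread
  integer(int32), allocatable :: raw(:, :)       ! 6 four-byte words per record
  real(real32),   allocatable :: t(:), w(:, :)
  integer :: i, k, fsize, nbad

  ! Create a test file in the same layout as the thread's input files:
  ! one sequential unformatted record holding 4 reals.
  open (10, file='stream_sketch.dat', form='unformatted', status='replace')
  do i = 1, nrec
     write (10) real(i, real32), 1.0_real32, 2.0_real32, 3.0_real32
  end do
  close (10)

  ! Assumption check: marker (4) + data (16) + marker (4) = 24 bytes per record.
  inquire (file='stream_sketch.dat', size=fsize)
  if (fsize /= 24*nrec) stop 'unexpected record layout'

  allocate (raw(6, nrec), t(nrec), w(nrec, 3))

  ! Read the whole file, record markers included, in one transfer.
  open (10, file='stream_sketch.dat', form='unformatted', access='stream', &
        status='old')
  read (10) raw
  close (10, status='delete')

  ! Words 1 and 6 of each record are the markers; each should hold the
  ! record length in bytes (16).
  if (any(raw(1, :) /= 16) .or. any(raw(6, :) /= 16)) stop 'unexpected record markers'

  ! Words 2..5 carry the four reals; reinterpret the bits without conversion.
  t = transfer(raw(2, :), 1.0_real32, nrec)
  do k = 1, 3
     w(:, k) = transfer(raw(2 + k, :), 1.0_real32, nrec)
  end do

  ! Verify against the values written above.
  nbad = 0
  do i = 1, nrec
     if (t(i) /= real(i, real32)) nbad = nbad + 1
  end do
  do k = 1, 3
     nbad = nbad + count(w(:, k) /= real(k, real32))
  end do
  write (*, *) 'Records read: ', nrec, '  mismatches: ', nbad
end program stream_read_sketch
```

The trade-off is that this depends on the compiler's record-marker layout, which the standard does not fix, so the size and marker checks are there to fail loudly if the assumption does not hold.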