We are a group of students from the Technical University of Sofia, Bulgaria. We are trying to run the NEMO Ocean project on CUDA with the PGI Accelerator model, by placing directives in the .F90 Fortran sources. We have nested “for” (DO) loops which we want to send to the CUDA cores for processing.
1. The directives we are using (a code snippet containing nested loops):
!$acc kernels loop
DO jk = 2, jpkm1
   DO jj = 2, jpjm1
      DO ji = fs_2, fs_jpim1 ! vector opt.
         zwt(ji,jj,jk) = zwt(ji,jj,jk) + ah_wslp2(ji,jj,jk)
      END DO
   END DO
END DO
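For anyone who wants to try the loop nest above in isolation, a minimal self-contained sketch might look like the following. The array extents, the initial values, and the definitions of fs_2 and fs_jpim1 as plain integer parameters are placeholders made up for illustration; in NEMO these come from the model configuration and preprocessor macros.

```fortran
! Minimal sketch of the loop nest above, outside NEMO.
! ASSUMPTION: all extents and bounds below are placeholder values,
! not the ones NEMO actually uses.
PROGRAM acc_loop_sketch
   IMPLICIT NONE
   INTEGER, PARAMETER :: wp = KIND(1.0d0)               ! double precision
   INTEGER, PARAMETER :: jpi = 8, jpj = 8, jpk = 4      ! hypothetical grid size
   INTEGER, PARAMETER :: jpjm1 = jpj - 1, jpkm1 = jpk - 1
   INTEGER, PARAMETER :: fs_2 = 2, fs_jpim1 = jpi - 1   ! hypothetical stand-ins
   REAL(wp) :: zwt(jpi,jpj,jpk), ah_wslp2(jpi,jpj,jpk)
   INTEGER  :: ji, jj, jk

   zwt      = 1.0_wp
   ah_wslp2 = 0.5_wp

   ! A compiler without accelerator support treats this directive as a comment.
   !$acc kernels loop
   DO jk = 2, jpkm1
      DO jj = 2, jpjm1
         DO ji = fs_2, fs_jpim1
            zwt(ji,jj,jk) = zwt(ji,jj,jk) + ah_wslp2(ji,jj,jk)
         END DO
      END DO
   END DO

   PRINT *, 'interior point (updated by the loop):', zwt(2,2,2)
   PRINT *, 'boundary point (not touched):        ', zwt(1,1,1)
END PROGRAM acc_loop_sketch
```

Because the directive is an ordinary comment to a compiler without accelerator support, the same file should build with either compiler, which makes it a small test case for comparing the two toolchains.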
2. We compiled the file containing the loops with the command:
pgfortran -Mcuda -c file.F90
3. The files which we include (via USE in the Fortran source) we compile to .mod with the following command: pgfortran -shared -fPIC include_file.F90
4. At this point we have “file.o” and “file.mod”.
5. After steps 2 and 3 we compile NEMO Ocean with the gfortran compiler, using the following configuration file:
%NCDF_LIB -L/usr/local/netcdf/lib -lnetcdf -lnetcdff
%FCFLAGS -I/usr/include/mpich2 -L/usr/lib/mpich2/lib -lmpich -lmpl -fdefault-real-8 -O3 -funroll-all-loops -fcray-pointer
%LDFLAGS -I/usr/include/mpich2 -L/usr/lib/mpich2/lib -lmpich -lmpl -L/usr/local/netcdf/lib -lnetcdf -lnetcdff
%FPPFLAGS -I/usr/include/mpich2 -L/usr/lib/mpich2/lib -lmpich -lmpl -P -C -traditional
Note: The command we use to compile NEMO is:
./makenemo -m gfortran_linux_modify -r ORCA2_LIM -n MY_ORCA2_LIM -j 4
6. After NEMO compiles successfully, a BUILD folder is generated. The folder contains all the “.o” and “.mod” files of NEMO.
7. Then we replace the successfully compiled files which contain the nested loops with the .o and .mod files produced by pgfortran (from step 3).
8. We comment out the lines of NEMO Ocean’s makefile that build these files, so that they are not regenerated when we compile with the “modified” (step 3) files.
9. We compile again, making sure (step 7) that the new pgfortran .o files are used in the new compilation. The following error appears:
lib__fcm__nemo.a(trazdf.o): In function `__trazdf_MOD_tra_zdf':
trazdf.F90:(.text+0x15f2): undefined reference to `__trazdf_imp_MOD_tra_zdf_imp'
trazdf.F90:(.text+0x1818): undefined reference to `__trazdf_imp_MOD_tra_zdf_imp'
collect2: error: ld returned 1 exit status
fcm_internal load failed (256)
We would like to know whether this is the right way (logically) to execute separately compiled Fortran code (step 3) on the CUDA cores.
Is our approach correct? That is, should we compile only the files containing Accelerator directives with pgfortran, or compile the entire NEMO Ocean project with pgfortran?
Technical University of Sofia,
Team of students
P.S. The university intends to buy a license for pgfortran if we manage to complete the project. We are using trial versions for now.