I’m working on a Fortran code that uses MPI. It is written in a style that often states array bounds explicitly, i.e. ARRAY(LowerBound:UpperBound), and it does this even in MPI calls. I realise that MPI might not handle array sections properly (MPI_SUBARRAYS_SUPPORTED = .FALSE.), yet the code works fine with gfortran and Cray Fortran but fails with nvfortran. I believe it works with the other compilers because the slice is just the whole array, and they recognise this and pass the whole array rather than a temporary copy. This also seems to work with nvfortran when the array is 1D, but in the 2D case it fails.
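For reference, the mpi module defines MPI_SUBARRAYS_SUPPORTED as a named logical constant (MPI 3.0 and later), so the lack of subarray support can be confirmed directly. A minimal sketch, separate from the reproduction below:

PROGRAM CHECK_SUBARRAYS
   USE MPI
   IMPLICIT NONE
   INTEGER :: IERROR
   CALL MPI_INIT(IERROR)
   ! .FALSE. means array-section (choice buffer) arguments may be passed
   ! via a compiler-generated temporary, which is unsafe for non-blocking
   ! calls that return before the transfer has completed.
   PRINT *, "MPI_SUBARRAYS_SUPPORTED = ", MPI_SUBARRAYS_SUPPORTED
   CALL MPI_FINALIZE(IERROR)
END PROGRAM CHECK_SUBARRAYS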
I have a simplified reproduction. MPI_WAITALL reports that the non-blocking receive has completed, but the receive array has not been overwritten with the sent values:
MODULE MPIINFO
   USE MPI
   IMPLICIT NONE
   INTEGER :: N               !THE RANK OF THIS PROCESS
   INTEGER :: ISIZE           !THE TOTAL NUMBER OF RANKS (COMMUNICATOR SIZE)
   INTEGER :: IERROR, provided
   INTEGER :: STATUSES(MPI_STATUS_SIZE, 2)
   INTEGER :: EXCHANGE_PROCID
   INTEGER :: DIM1, DIM2
   INTEGER, ALLOCATABLE, DIMENSION(:, :) :: SOL_SEND !ARRAY TO HOLD THE VALUES TO BE SENT
   INTEGER, ALLOCATABLE, DIMENSION(:, :) :: SOL_RECV !ARRAY TO HOLD THE VALUES TO BE RECEIVED
CONTAINS
   SUBROUTINE EXCHANGE_SOL()
      IMPLICIT NONE
      INTEGER :: T
      integer :: n_requests
      integer, dimension(2) :: requests

      n_requests = 0
      requests(:) = 0

      n_requests = n_requests + 1
      PRINT *, "Rank", N, " Sending 2D Array to ", EXCHANGE_PROCID
      PRINT *, ""
      PRINT *, N, " Contents of send buffer: ", SOL_SEND(1:DIM1, 1:DIM2)
      PRINT *, ""
      CALL MPI_ISEND( &
         SOL_SEND(1:DIM1, 1:DIM2), &
         DIM1*DIM2, MPI_INTEGER, &
         EXCHANGE_PROCID, 0, &
         MPI_COMM_WORLD, requests(n_requests), ierror &
      )
      PRINT *, "Rank", N, " Send errors: ", ierror

      n_requests = n_requests + 1
      CALL MPI_IRECV( &
         SOL_RECV(1:DIM1, 1:DIM2), &
         DIM1*DIM2, MPI_INTEGER, &
         EXCHANGE_PROCID, 0, &
         MPI_COMM_WORLD, requests(n_requests), ierror &
      )
      PRINT *, "Rank", N, " Receive errors: ", ierror

      CALL MPI_WAITALL(n_requests, requests, STATUSES, ierror)
      PRINT *, "Rank", N, " Wait errors: ", ierror
      PRINT *, "Rank", N, " Contents of received buffer", SOL_RECV(1:DIM1, 1:DIM2)
   END SUBROUTINE EXCHANGE_SOL
END MODULE MPIINFO
PROGRAM MPI_2D_SLICED_NONBLOCKING_TEST
   USE MPIINFO
   IMPLICIT NONE
   INTEGER :: J, K

   DIM1 = 2
   DIM2 = 2

   CALL MPI_INIT_THREAD(MPI_THREAD_FUNNELED, PROVIDED, IERROR)
   CALL MPI_COMM_SIZE(MPI_COMM_WORLD, ISIZE, IERROR)
   CALL MPI_COMM_RANK(MPI_COMM_WORLD, N, IERROR)

   ALLOCATE (SOL_SEND(DIM1, DIM2))
   ALLOCATE (SOL_RECV(DIM1, DIM2))

   IF (N .eq. 0) THEN
      EXCHANGE_PROCID = 1
   END IF
   IF (N .eq. 1) THEN
      EXCHANGE_PROCID = 0
   END IF

   DO J = 1, DIM1
      DO K = 1, DIM2
         SOL_RECV(J, K) = -J - K
         SOL_SEND(J, K) = J + K
      END DO
   END DO

   CALL EXCHANGE_SOL()

   CALL MPI_BARRIER(MPI_COMM_WORLD, IERROR)
   CALL MPI_FINALIZE(IERROR)
END PROGRAM MPI_2D_SLICED_NONBLOCKING_TEST
Output from the nvfortran-compiled code (run on two MPI ranks):
Rank 0 Sending 2D Array to 1
Rank 0 Contents of send buffer: 2 3 3 4
Rank 0 Send errors: 0
Rank 0 Receive errors: 0
Rank 0 Wait errors: 0
Rank 0 Contents of received buffer -2 -3 -3 -4
Rank 1 Sending 2D Array to 0
Rank 1 Contents of send buffer: 2 3 3 4
Rank 1 Send errors: 0
Rank 1 Receive errors: 0
Rank 1 Wait errors: 0
Rank 1 Contents of received buffer -2 -3 -3 -4
Expected output (gfortran-compiled, run on two MPI ranks):
Rank 0 Sending 2D Array to 1
Rank 0 Contents of send buffer: 2 3 3 4
Rank 1 Sending 2D Array to 0
Rank 1 Contents of send buffer: 2 3 3 4
Rank 0 Send errors: 0
Rank 0 Receive errors: 0
Rank 1 Send errors: 0
Rank 1 Receive errors: 0
Rank 0 Wait errors: 0
Rank 0 Contents of received buffer 2 3 3 4
Rank 1 Wait errors: 0
Rank 1 Contents of received buffer 2 3 3 4
If I remove the LowerBound:UpperBound slices from the MPI calls it works, as in the sketch below. Obviously, I could go through the code and strip every unnecessary slice from its MPI calls; it is just a slight annoyance that this works with the other compilers but not with nvfortran.
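For completeness, this is the workaround I mean: a sketch of just the two calls inside EXCHANGE_SOL, passing the whole arrays instead of explicit sections so the compiler has no reason to create a temporary for the buffer arguments.

      ! Whole allocatable arrays are contiguous and passed without a copy,
      ! so the buffers remain valid while the non-blocking operations are
      ! in flight.
      n_requests = n_requests + 1
      CALL MPI_ISEND(SOL_SEND, DIM1*DIM2, MPI_INTEGER, EXCHANGE_PROCID, 0, &
                     MPI_COMM_WORLD, requests(n_requests), ierror)
      n_requests = n_requests + 1
      CALL MPI_IRECV(SOL_RECV, DIM1*DIM2, MPI_INTEGER, EXCHANGE_PROCID, 0, &
                     MPI_COMM_WORLD, requests(n_requests), ierror)
      CALL MPI_WAITALL(n_requests, requests, STATUSES, ierror)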