Hi,
My compiler is the 64-bit PGI Accelerator Fortran 11.7, and I have run into a problem when running CUDA + MPI code on 64-bit workstations.
It is a seismic migration code. It loops over many shots, and every shot requires thousands of timesteps. That is the background of the code.
At first I tested the program with only 10 timesteps, or a few hundred, and it finished normally. But when I use a realistic timestep count of about 6000, which makes the run take a long time, this error occurs:
killed by single 2
p0_31083: p4_error: net_recv read: probable EOF on socket: 1
p0_31083: (33208.406250) net_send: could not write to fd=4, errno = 32
*=============
The command lines are:
pgfortran -Mcuda -Mmpi -o mpi mpi.f90
mpirun -np 3 mpi >a.dat&
*==============
The code is:
program RTM
use cudafor
include 'mpif.h'
! parameter definitions go here
call MPI_INIT(IEER)
call MPI_COMM_SIZE(MPI_COMM_WORLD, NUMPROCS, IEER)
call MPI_COMM_RANK(MPI_COMM_WORLD, MYID, IEER)
! some input files are read here
ierr=cudaGetDeviceCount(numdev)
ierr=cudasetdevice(myid)
call cal(parameters)
call MPI_FINALIZE(IEER)
end
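
A note on the device selection in the main program: with `mpirun -np 3`, `cudaSetDevice(myid)` asks for devices 0, 1 and 2, and the return codes are never checked. Below is a minimal sketch of a checked version, assuming that wrapping the rank with `mod` onto the available devices is acceptable for this job. The program name `pick_device`, the variable `idev`, and the hard-coded `myid` are hypothetical stand-ins, not part of the original code.

```fortran
program pick_device
  use cudafor
  implicit none
  integer :: istat, numdev, myid, idev

  ! Hypothetical stand-in for the rank returned by MPI_COMM_RANK.
  myid = 2

  ! Ask the runtime how many CUDA devices this node has.
  istat = cudaGetDeviceCount(numdev)
  if (istat /= cudaSuccess) then
     print *, 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
     stop 1
  end if
  if (numdev < 1) then
     print *, 'no CUDA device found on this node'
     stop 1
  end if

  ! Wrap the rank onto the available devices (assumption: sharing is acceptable).
  idev = mod(myid, numdev)
  istat = cudaSetDevice(idev)
  if (istat /= cudaSuccess) then
     print *, 'cudaSetDevice failed: ', trim(cudaGetErrorString(istat))
     stop 1
  end if

  print *, 'rank', myid, 'uses device', idev, 'of', numdev
end program pick_device
```

This would be built with `pgfortran -Mcuda`. It only rules out an invalid device index; it does not by itself explain the error above.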
subroutine cal(parameters)
use cudafor
include 'mpif.h'
! some parameter calculations go here
do ishots=1+myid,nshots,numprocs   ! shots cycle
do it=1,max_timesteps
! the GPU subroutines are called here
host_array = device_array   ! copy the result from the device back to the host
enddo
! write the result to disk
enddo
end subroutine
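
For reference, the shots cycle `do ishots=1+myid,nshots,numprocs` hands out shots round-robin across the MPI ranks. The standalone sketch below needs neither MPI nor CUDA: it loops over the ranks serially and checks that every shot is taken exactly once. The sizes `nshots = 10` and `numprocs = 3`, the program name, and the `owner_count` array are hypothetical values chosen for illustration.

```fortran
program shot_distribution
  implicit none
  ! Hypothetical sizes; the real code reads nshots from its input.
  integer, parameter :: nshots = 10, numprocs = 3
  integer :: owner_count(nshots)
  integer :: myid, ishots

  owner_count = 0

  ! Emulate each MPI rank in turn and replay its shots cycle.
  do myid = 0, numprocs - 1
     do ishots = 1 + myid, nshots, numprocs
        owner_count(ishots) = owner_count(ishots) + 1
        print '(a,i0,a,i0)', 'rank ', myid, ' takes shot ', ishots
     end do
  end do

  ! Every shot must be assigned to exactly one rank.
  if (any(owner_count /= 1)) then
     print *, 'some shot is missing or assigned twice'
     stop 1
  end if
  print *, 'every shot is assigned to exactly one rank'
end program shot_distribution
```

With these sizes rank 0 gets four shots (1, 4, 7, 10) and ranks 1 and 2 get three each, so the ranks do not necessarily finish at the same time.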
*===========================
Thanks to anyone who can help solve this problem.