Hello.
With pgi/13.2 I am experiencing a memory leak problem with OpenACC (PgiAcc) directive code.
The problem was not present with pgi/12.10 or pgi/13.1.
(In this test case I’ve used CUDA 5.0, but it’s the same with CUDA 4.2.)
As an example, here is my now-favorite test case: the “acc_f3.f90” example that ships with the pgi/13.2 compiler.
As it is an acc/CUDA memory leak problem, I’ve added a loop in the code around the two smooth calls:
escj@aeropc107:~/dir_PGF/PGI_HOME/linux86-64/13.2/etc/samples/openacc> diff acc_f3.f90 acc_f3_loop.f90
128a129
> do it=1,10
135a137
> enddo
less acc_f3_loop.f90
...
do it=1,10
call system_clock( count=c1 )
call smooth( aa, bb, w0, w1, w2, n, m, iters )
call system_clock( count=c2 )
cgpu = c2 - c1
call smoothhost( aahost, bbhost, w0, w1, w2, n, m, iters )
call system_clock( count=c3)
chost = c3 - c2
enddo
With pgi/13.1: OK
pgfortran -o acc_f3_loop.uni acc_f3_loop.f90 -acc -O0
cuda-memcheck acc_f3_loop.uni
========= CUDA-MEMCHECK
0 errors found
1122952 microseconds on GPU
339374 microseconds on host
========= ERROR SUMMARY: 0 errors
With pgi/13.2: PROBLEMS
pgfortran --version
pgfortran 13.2-0 64-bit target on x86-64 Linux -tp nehalem
cuda-memcheck acc_f3_loop.uni
========= CUDA-MEMCHECK
0 errors found
1133705 microseconds on GPU
337319 microseconds on host
========= Program hit error 712 on CUDA API call to cuMemHostRegister
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib64/libcuda.so (cuMemHostRegister + 0x1ea) [0xd3e6a]
========= Host Frame:acc_f3_loop.uni [0xf6c0]
...
========= Program hit error 712 on CUDA API call to cuMemHostRegister
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib64/libcuda.so (cuMemHostRegister + 0x1ea) [0xd3e6a]
========= Host Frame:acc_f3_loop.uni [0xf6c0]
=========
========= ERROR SUMMARY: 18 errors
Error 712 is:
CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED = 712
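For anyone trying to reproduce this, it can help to tally which CUDA API calls fail and with which error code, rather than reading the backtraces one by one. The following is only a minimal sketch in Python, assuming the cuda-memcheck log keeps the "Program hit error N on CUDA API call to X" line format shown above. The function name summarize_memcheck and the SAMPLE_LOG variable are hypothetical, and the sample contains only a few of the lines quoted in this post, not the full log.

```python
import re
from collections import Counter

# A few lines copied from the cuda-memcheck output quoted in this post.
# This is NOT the complete log: the elided lines are left out.
SAMPLE_LOG = """\
========= CUDA-MEMCHECK
0 errors found
========= Program hit error 712 on CUDA API call to cuMemHostRegister
========= Saved host backtrace up to driver entry point at error
========= Program hit error 712 on CUDA API call to cuMemHostRegister
========= ERROR SUMMARY: 18 errors
"""

# Matches lines such as:
# ========= Program hit error 712 on CUDA API call to cuMemHostRegister
ERROR_LINE = re.compile(r"Program hit error (\d+) on CUDA API call to (\w+)")


def summarize_memcheck(log_text):
    """Count (error code, API call) pairs reported in a cuda-memcheck log.

    Hypothetical helper: it assumes the line format seen in this post.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (code, api), n in summarize_memcheck(SAMPLE_LOG).items():
        print(f"error {code} in {api}: {n} occurrence(s)")
```

Running it on a full log saved to a file (for example by redirecting the cuda-memcheck output) should make it easy to compare the pgi/13.1 and pgi/13.2 runs side by side.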
In my own code, the memory leak ends with a fatal error in cuMemcpy2DAsync (again, the code works with pgi/12.10 and pgi/13.1):
+ MESONH-LXpgiI4-MNH-V4-9-4-0-ACC_PGI_GOODDIR2_NUWA-MPIAUTO-CUDA_DB
hello word form rank= 0 iproc= 1
...
call to cuMemcpy2DAsync returned error 1: Invalid value
Regards,
Juan