total/free CUDA memory: 0/0 using openacc with PGI 17.5

rpic · October 4, 2017, 3:56pm

Hi,

I am having issues with PGI version 17.5, while the same code works fine with version 16.10. I am therefore wondering whether something changed in the compiler that might cause this. The problem is reproducible on two different systems.

The crash occurs the first time OpenACC is used (see below). If I disable the first OpenACC statement, it crashes at the next one, and the backtrace through the __pgi_uacc functions is the same.

total/free CUDA memory: 0/0
Application 3779914 is crashing. ATP analysis proceeding...

ATP Stack walkback for Rank 0 starting:
....
  __pgi_uacc_initialize@init.c:701
  __pgi_uacc_enumerate@init.c:538
  __pgi_uacc_cuda_init@cuda_init.c:369
  __pgi_uacc_cuda_error_handler@cuda_error.c:64

The code uses MPI, OpenACC, CUFFT and some CUDA kernels for the important parts of the code. Unified memory is not used. The compiler and linker flags are:

                "FFLAGS= -O2 -acc -ta=tesla:cc60,cuda8.0 -Minfo=accel -Mcuda=cuda8.0  " \
                "LFLAGS=  " \
                "LIBS= -acc -ta=tesla:cc60,cuda8.0 -Minfo=accel -Mcuda=cuda8.0  -lcufft " \

Thanks for your help,
Richard

MatColgrove · October 4, 2017, 7:06pm

Hi Richard,

There have been many improvements to the compilers between 16.10 and 17.5, but unfortunately I don’t know what’s causing this problem.

The crash seems to occur when the runtime first initializes the device, but I can’t think of any change in the compiler that would cause this.

Can you post or send to PGI Customer Service (trs@pgroup.com) a reproducing example?

Thanks,
Mat

rpic · October 8, 2017, 7:08pm

Hi Mat,

Thanks for the answer. I will get in touch with them.

In the meantime, I had another look at the code. When I disable all the CUDA code and remove -Mcuda=cuda8.0, it works. Removing only the CUDA code doesn’t help, so the issue seems to come from the -Mcuda=cuda8.0 flag.
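
For reference, a sketch of the flag set that would correspond to the working case, assuming everything else in the flags posted above stays unchanged and only -Mcuda=cuda8.0 is dropped. Whether -lcufft was still linked in that test is not stated here; it is kept below purely as an assumption.

```
                "FFLAGS= -O2 -acc -ta=tesla:cc60,cuda8.0 -Minfo=accel  " \
                "LFLAGS=  " \
                "LIBS= -acc -ta=tesla:cc60,cuda8.0 -Minfo=accel  -lcufft " \
```

Comparing a build with these flags against the original one on the same source (CUDA code disabled in both) isolates the effect of -Mcuda=cuda8.0 on runtime initialization.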

Best regards,
Richard
