I have been working with OpenACC and MFIX. I started getting a cuStreamSynchronize error when running a version of my code that used to work fine. Below is the trailing part of the debugging output obtained with PGI_ACC_DEBUG=3. The problem seems to occur right after the desgrid_neigh_build_gpu function is executed. Can you please help me understand what is going on? I tried recompiling with 14.10 as well as 15.1, but the error remains, even though the code used to work fine with 14.10. The GPU is an NVIDIA Tesla K40.
I can send the full output, as well as run other diagnostics, if needed.
Thanks very much in advance.
pgi_uacc_dataenterdone( devid=1 )
pgi_uacc_enter( devid=0 )
Thread 1 loading module onto device 0
pgi_uacc_enter(objinfo=0xb19f20)
0x11235813 = magic
1 = numplatforms
Platform 0 = 0xb19ec0
17 = id NVIDIA CUDA
0x00000001 = flags
2 = numversions
Version 0 = 0xb19e40
0x33550336 = magic
0x00210104 = flags sm20
0x00430104 = pflags sm30 sm35
1 = numfunctions
1 = numbinaries
Function 0 = 0xb19d80 = desgrid_neigh_build_gpu_507_gpu
507 = lineno
128x1x1 = block size
-1x1x1 = grid size
0x1 = config_flag = divx(0)
1x1x1 = unroll
0 = shared memory
0 = reduction shared memory
0 = reduction arg
0 = reduction bytes
140 = argument bytes
0 = max argument bytes
1 = size arguments
Binary 0 = 0xb19d20
0x000007d0 = binaryid
9360 = binarylen
0x1a1f8 = binary
Version 1 = 0xb19e80
0x33550336 = magic
0x00410104 = flags sm30
0x00000104 = pflags
1 = numfunctions
1 = numbinaries
Function 0 = 0xb19d80 = desgrid_neigh_build_gpu_507_gpu
507 = lineno
0xbe85f8ab269460b0 = handle[dev index=2]
0xbe81f26af2b11d2c = handle[dev index=3]
128x1x1 = block size
-1x1x1 = grid size
0x1 = config_flag = divx(0)
1x1x1 = unroll
0 = shared memory
0 = reduction shared memory
0 = reduction arg
0 = reduction bytes
140 = argument bytes
0 = max argument bytes
1 = size arguments
Binary 0 = 0xb19d40
0x00000bb8 = binaryid
9360 = binarylen
0x1a1f8 = binary
pgi_uacc_computestart( file=/home/anirban/mfix/mfix.0019/model/./des/desgrid_mod.f, function=desgrid_neigh_build_gpu, line=495:495, line=505, devid=0 )
pgi_uacc_launch funcnum=0 argptr=0x7fffaa6cf1d0 sizeargs=0x7fffaa6cf1c0 async=-1 devid=1
Arguments to function 0 desgrid_neigh_build_gpu_507_gpu dindex=1 threadid=1 device=1:
147831 0 94633984 35 102891520 35 100794368 35 101842944 35 93323264 35 97124352 35 76283904 35
0x00024177 0x00000000 0x05a40000 0x00000023 0x06220000 0x00000023 0x06020000 0x00000023 0x06120000 0x00000023 0x05900000 0x00000023 0x05ca0000 0x00000023 0x048c0000 0x00000023
Launch configuration for function=0=desgrid_neigh_build_gpu_507_gpu line=507 dindex=1 threadid=1 device=1 <<<(1155,1,1),(128,1,1),0>>>
pgi_uacc_computedone( devid=1 )
pgi_uacc_cuda_wait(lineno=-99,async=-1,dindex=1)
pgi_uacc_cuda_wait(sync on stream=0x5334c9a0)
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
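In case it helps with reading the argument dump in the log, here is a small Python sketch that pairs up the 32-bit words from the "Arguments to function 0" line. It rests on two assumptions of mine, not on anything the PGI runtime documents in this output: that each pair of words is the low and high half of one 64-bit value, and that the first value is the loop trip count while the remaining ones are device addresses. The variable names are my own.

```python
# 32-bit words copied from the hex line of the "Arguments to function 0" dump.
words = [
    0x00024177, 0x00000000,
    0x05a40000, 0x00000023,
    0x06220000, 0x00000023,
    0x06020000, 0x00000023,
    0x06120000, 0x00000023,
    0x05900000, 0x00000023,
    0x05ca0000, 0x00000023,
    0x048c0000, 0x00000023,
]


def combine(lo, hi):
    """Join two 32-bit words into one 64-bit value (assumes low word first)."""
    return (hi << 32) | lo


values = [combine(words[i], words[i + 1]) for i in range(0, len(words), 2)]

# Assumption: first value is the loop trip count, the rest are device addresses.
trip_count = values[0]
addresses = values[1:]

block_size = 128  # from "128x1x1 = block size" in the log
grid_size = -(-trip_count // block_size)  # integer ceiling division

print("trip count:", trip_count)
print("grid size :", grid_size)
for addr in addresses:
    print("address   :", hex(addr))
```

Under these assumptions the computed grid size agrees with the <<<(1155,1,1),(128,1,1),0>>> launch configuration in the log, so the launch itself looks consistent with the trip count; the dump shows only 16 words although the log reports 140 argument bytes, so the remaining arguments are not visible here.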