Unexpected cuStreamSynchronize error

Hi,
I have been working with OpenACC and MFIX. I started getting a cuStreamSynchronize error when running a version of my code that used to work fine. Below is the tail of the debugging output obtained with PGI_ACC_DEBUG=3. The failure seems to occur right after the desgrid_neigh_build_gpu function is executed. Can you please help me understand what is going on? I tried recompiling with 14.10 as well as 15.1, but the error remains. It used to work fine with 14.10. The GPU is an NVIDIA Tesla K40.

I can send the full output, as well as run other diagnostics, if needed.

Thanks much in advance

pgi_uacc_dataenterdone( devid=1 )
pgi_uacc_enter( devid=0 )
Thread 1 loading module onto device 0
pgi_uacc_enter(objinfo=0xb19f20)
    0x11235813 = magic
             1 = numplatforms
 Platform 0 = 0xb19ec0
            17 = id  NVIDIA CUDA 0x00000001 = flags
             2 = numversions
  Version 0 = 0xb19e40
    0x33550336 = magic    0x00210104 = flags sm20    0x00430104 = pflags sm30 sm35
             1 = numfunctions             1 = numbinaries
   Function 0 = 0xb19d80 = desgrid_neigh_build_gpu_507_gpu
           507 = lineno
    128x1x1 = block size
        -1x1x1 = grid size    0x1 = config_flag = divx(0)
    1x1x1 = unroll
0 = shared memory
             0 = reduction shared memory
             0 = reduction arg             0 = reduction bytes
           140 = argument bytes
             0 = max argument bytes
             1 = size arguments
   Binary 0 = 0xb19d20    0x000007d0 = binaryid
          9360 = binarylen
       0x1a1f8 = binary
  Version 1 = 0xb19e80
    0x33550336 = magic    0x00410104 = flags sm30    0x00000104 = pflags
             1 = numfunctions
             1 = numbinaries
   Function 0 = 0xb19d80 = desgrid_neigh_build_gpu_507_gpu
           507 = lineno    0xbe85f8ab269460b0 = handle[dev index=2]
    0xbe81f26af2b11d2c = handle[dev index=3]
    128x1x1 = block size
        -1x1x1 = grid size
    0x1 = config_flag = divx(0)    1x1x1 = unroll
0 = shared memory
             0 = reduction shared memory
             0 = reduction arg
             0 = reduction bytes
           140 = argument bytes
             0 = max argument bytes
             1 = size arguments
   Binary 0 = 0xb19d40
    0x00000bb8 = binaryid
          9360 = binarylen
       0x1a1f8 = binary
pgi_uacc_computestart( file=/home/anirban/mfix/mfix.0019/model/./des/desgrid_mod.f, function=desgrid_neigh_build_gpu, line=495:495, line=505, devid=0 )
pgi_uacc_launch funcnum=0 argptr=0x7fffaa6cf1d0 sizeargs=0x7fffaa6cf1c0 async=-1 devid=1
Arguments to function 0 desgrid_neigh_build_gpu_507_gpu dindex=1 threadid=1 device=1:
            147831          0   94633984         35  102891520         35  100794368         35
         101842944         35   93323264         35   97124352         35   76283904         35
        0x00024177 0x00000000 0x05a40000 0x00000023 0x06220000 0x00000023 0x06020000 0x00000023
        0x06120000 0x00000023 0x05900000 0x00000023 0x05ca0000 0x00000023 0x048c0000 0x00000023
Launch configuration for function=0=desgrid_neigh_build_gpu_507_gpu line=507 dindex=1 threadid=1 device=1 <<<(1155,1,1),(128,1,1),0>>>
pgi_uacc_computedone( devid=1 )
pgi_uacc_cuda_wait(lineno=-99,async=-1,dindex=1)
pgi_uacc_cuda_wait(sync on stream=0x5334c9a0)
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution

Hi anirbanjana,

This error indicates that the kernel is accessing a bad address. Normally I'd look for out-of-bounds accesses and similar issues in the code, but since this code worked with an earlier compiler, I suspect that the new version is generating a bad reference. It could also be that the new version exposes an existing problem in the code, though I can't tell without investigating.
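
As a quick sanity check on the log above, the launch configuration can be compared against the first kernel argument. The sketch below is only arithmetic on numbers copied from the debug output. It assumes, without confirmation from the source code, that the first argument (0x00024177) is the loop trip count; the helper name `blocks_needed` is made up for illustration.

```python
def blocks_needed(trip_count: int, block_size: int) -> int:
    """Thread blocks needed to cover trip_count iterations (ceiling division)."""
    return (trip_count + block_size - 1) // block_size


# Values copied from the debug log.
first_arg = 0x00024177   # first kernel argument, printed as 147831 in decimal
block_size = 128         # from "128x1x1 = block size"
logged_grid = 1155       # from <<<(1155,1,1),(128,1,1),0>>>

# ASSUMPTION: the first argument is the loop trip count.
grid = blocks_needed(first_arg, block_size)

# Threads in the last block that fall past the assumed trip count.
padding_threads = grid * block_size - first_arg

print("computed grid size:", grid)
print("matches logged grid:", grid == logged_grid)
print("threads past the trip count:", padding_threads)
```

If that assumption holds, the grid size is consistent with a loop of 147831 iterations, so the launch configuration itself looks reasonable and the indices computed inside the loop body are the more likely place to look for the bad address.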

Can you send me an updated version of MFIX? The version I have is a bit old. I have the Mueller 74k and 9K test cases. Are these sufficient to reproduce the failure?

Thanks,
Mat