Getting a CPU lockup error when compiling CUDA Fortran

I occasionally get a CPU soft lockup error when I try to compile CUDA Fortran code. As far as I can tell, it occurs randomly. When it happens, the full output looks something like this:

datatypes.f90:
cuSOLVER_interfaces.cuf:
gpu_kernals.cuf:
ptxas info    : 56 bytes gmem, 8 bytes cmem[14]
ptxas info    : Function properties for gpu_kernals_findchannel_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for gpu_kernals_releasechannel_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for gpu_kernals_auxv_gpu_
    56 bytes stack frame, 52 bytes spill stores, 52 bytes spill loads
ptxas info    : Function properties for gpu_kernals_auxv1gpu_
    56 bytes stack frame, 52 bytes spill stores, 52 bytes spill loads
ptxas info    : Function properties for gpu_kernals_auxc_gpu_
    80 bytes stack frame, 76 bytes spill stores, 76 bytes spill loads
ptxas info    : Function properties for gpu_kernals_xint_gpu_
    96 bytes stack frame, 92 bytes spill stores, 92 bytes spill loads
ptxas info    : Function properties for gpu_kernals_xintu_gpu_
    208 bytes stack frame, 216 bytes spill stores, 244 bytes spill loads
ptxas info    : Compiling entry function 'gpu_kernals_eint1gpu_' for 'sm_20'
ptxas info    : Function properties for gpu_kernals_eint1gpu_
    128 bytes stack frame, 140 bytes spill stores, 172 bytes spill loads
ptxas info    : Used 63 registers, 45064 bytes smem, 344 bytes cmem[0], 16 bytes cmem[16]
ptxas info    : Compiling entry function 'gpu_kernals_eint2gpu_' for 'sm_20'
ptxas info    : Function properties for gpu_kernals_eint2gpu_
    168 bytes stack frame, 264 bytes spill stores, 504 bytes spill loads
ptxas info    : Used 63 registers, 32808 bytes smem, 280 bytes cmem[0], 16 bytes cmem[16]
ptxas info    : Compiling entry function 'gpu_kernals_vec2matrix_gpu_' for 'sm_20'
ptxas info    : Function properties for gpu_kernals_vec2matrix_gpu_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 18 registers, 80 bytes cmem[0]
ptxas info    : Compiling entry function 'gpu_kernals_isqrtsmx_gpu_' for 'sm_20'
ptxas info    : Function properties for gpu_kernals_isqrtsmx_gpu_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 21 registers, 72 bytes cmem[0], 16 bytes cmem[16]
ptxas info    : Compiling entry function 'gpu_kernals_formd_gpu_' for 'sm_20'
ptxas info    : Function properties for gpu_kernals_formd_gpu_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 28672 bytes smem, 152 bytes cmem[0]
dfratomgpu.cuf:
gpualg2.cuf:
dfratom.f:
2eint_binarysearch.cuf:
ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function 'binary_search_binary_search_gpu_' for 'sm_20'
ptxas info    : Function properties for binary_search_binary_search_gpu_
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 31 registers, 84 bytes smem, 80 bytes cmem[0]
datatypes.f90:
cuSOLVER_interfaces.cuf:

Message from syslogd@tc at May  8 04:07:07 ...
 kernel:BUG: soft lockup - CPU#6 stuck for 67s! [pgf901:6587]

Message from syslogd@tc at May  8 04:08:31 ...
 kernel:BUG: soft lockup - CPU#6 stuck for 67s! [pgf901:6587]

Sometimes the build completes fine; other times it stops while compiling a different file (not always cuSOLVER_interfaces.cuf). I’m using PGI 16.9. Does anyone know what to do?

Hi MariuszK,

I have not seen this before. Doing a web search, I see lots of similar errors, but all appear to be VM, hardware, or driver related. I have no idea why a compilation would trigger this.

If you can send us your code (trs@pgroup.com), we can try to reproduce your error here. Also, please send us your hardware details so we can try to best match it to a system we have here. However, my best guess is that the problem is specific to your system.
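
If it helps with gathering the hardware details, here is a rough sketch of a Python script that collects the kernel, CPU, GPU/driver, and compiler information into one plain-text report. It assumes a Linux system where `lscpu`, `nvidia-smi`, and `pgfortran` may be on the PATH; the function names are made up for illustration, and any tool that is missing is simply noted rather than treated as an error.

```python
import platform
import shutil
import subprocess


def run_if_available(cmd):
    """Return the output of cmd, or a short note if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return f"({cmd[0]} not found)"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"({cmd[0]} failed: {exc})"
    # Some tools print their version banner on stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


def collect_system_report():
    """Gather kernel, CPU, GPU/driver, and compiler details as plain text."""
    uname = platform.uname()
    sections = [
        ("Kernel", f"{uname.system} {uname.release} {uname.machine}"),
        ("CPU", run_if_available(["lscpu"])),
        ("GPU and driver", run_if_available(["nvidia-smi"])),
        # Assumption: the PGI compiler driver is installed as pgfortran.
        ("Compiler", run_if_available(["pgfortran", "-V"])),
    ]
    return "\n\n".join(f"== {title} ==\n{body}" for title, body in sections)


if __name__ == "__main__":
    print(collect_system_report())
```

Redirecting the output to a file (for example `python report.py > system_report.txt`, where the script name is arbitrary) gives something that can be attached to the email along with the code.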

-Mat