My code runs fine with version 15.10.
After updating to version 16.1, my code frequently stops with a message like:
Out of memory allocating 13421760 bytes of device memory
total/free CUDA memory: 2147155968/13303808
Present table dump for device: NVIDIA Tesla GPU 0, compute capability 2.1
host:0xfed2360 device:0x2031a0000 size:13421760 presentcount:1+0 line:164 name:rublten_edge
host:0x7f6be80bfe80 device:0x202020000 size:13421760 presentcount:1+0 line:164 name:mass_edge
host:0x7ffcc849e4b0 device:0x201f20a00 size:96 presentcount:1+0 line:164 name:descriptor
host:0x7ffcc849ee50 device:0x201f20000 size:96 presentcount:1+0 line:164 name:descriptor
call to cuMemAlloc returned error 2: Out of memory
Note that the dump reports only ~13 MB free out of ~2 GB total, while the failing allocation needs ~13.4 MB, so the card really is almost full. At first I thought this was GPU memory not being cleared after a previous failure. Now I suspect it could be related to GPU/CPU binding, since I have 6 GPU cards and am running 12 MPI tasks.
Appreciate your help,