Error message when stepping out of __global__ function in cuda-gdb

When I try to step out of a __global__ function in cuda-gdb, I get the following error message:

(cuda-gdb) s
0x00002aaaac219110 in cuVDPAUCtxCreate () from /lib64/
(cuda-gdb) s
Single stepping until exit from function cuVDPAUCtxCreate,
which has no line number information.
cuda-gdb/7.12/gdb/infrun.c:2794: internal-error: resume: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

In my code, the first line of host code after the __global__ kernel launch is cudaDeviceSynchronize(). This is the backtrace I get at that point:

(cuda-gdb) bt
#0  0x00002aaaac219110 in cuVDPAUCtxCreate () from /lib64/
#1  0x00002aaaac219504 in cuVDPAUCtxCreate () from /lib64/
#2  0x00002aaaac11e65c in cudbgApiDetach () from /lib64/
#3  0x00002aaaac11e810 in cudbgApiDetach () from /lib64/
#4  0x00002aaaac052b5a in ?? () from /lib64/
#5  0x00002aaaac1a4a9d in cuCtxSynchronize () from /lib64/
#6  0x00000000005163ad in cudart::cudaApiDeviceSynchronize() ()
#7  0x000000000053b04d in cudaDeviceSynchronize ()
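
For reference, the code structure described above (a kernel launch followed immediately by cudaDeviceSynchronize()) looks roughly like the sketch below. This is a simplified stand-in, not my actual code: the kernel name `myKernel`, its body, and the buffer `d_out` are placeholders chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel (hypothetical name and body).
// Stepping with `s` past the last line of a kernel like this one
// is the situation in which the error above appears.
__global__ void myKernel(int *out)
{
    out[threadIdx.x] = threadIdx.x;
}

int main()
{
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, 32 * sizeof(int)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Launch one block of 32 threads.
    myKernel<<<1, 32>>>(d_out);

    // First line of host code after the kernel launch.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

A program of this shape would typically be built with device debug information (for example `nvcc -g -G`) before being run under cuda-gdb.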

Does anyone know whether this is a cuda-gdb bug or a problem in my own code? Thank you.