cudaEventQuery() blocks

I am calling cudaEventQuery() in a periodic ITIMER callback register by the main program.
The thread is at cudaDeviceSynchronize() waiting for a kernel to finish.

I am seeing that cudaEventQuery() does not return and get blocked.

I have attached the program to this file and the callstack when cudaEventQuery() gets stuck.

this is on Nvidia Tesla 2070 GPU.

I appreciate any information/help on eliminating this problem/bug.

(gdb) bt
#0 0x00000037f520e034 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000037f5209345 in _L_lock_868 () from /lib64/libpthread.so.0
#2 0x00000037f5209217 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f7bb6fd75b7 in ?? () from /usr/lib64/libcuda.so.1
#4 0x00007f7bb6fd575a in ?? () from /usr/lib64/libcuda.so.1
#5 0x00007f7bb70062e3 in ?? () from /usr/lib64/libcuda.so.1
#6 0x00007f7bb700c3ec in ?? () from /usr/lib64/libcuda.so.1
#7 0x00007f7bb6fc95d8 in ?? () from /usr/lib64/libcuda.so.1
#8 0x00007f7bb6fb9c35 in ?? () from /usr/lib64/libcuda.so.1
#9 0x00007f7bb6a5ad57 in ?? () from /usr/local/cuda/lib64/libcudart.so.4
#10 0x00007f7bb6a8c4f2 in cudaEventQuery () from /usr/local/cuda/lib64/libcudart.so.4
#11 0x0000000000400e8d in event_handler (signum=14) at event_sampling.cu:40
#12
#13 0x00007f7bb7003791 in ?? () from /usr/lib64/libcuda.so.1
#14 0x00007f7bb6fd5786 in ?? () from /usr/lib64/libcuda.so.1
#15 0x00007f7bb70062e3 in ?? () from /usr/lib64/libcuda.so.1
#16 0x00007f7bb7006646 in ?? () from /usr/lib64/libcuda.so.1
#17 0x00007f7bb6fd5839 in ?? () from /usr/lib64/libcuda.so.1
#18 0x00007f7bb6fc86e0 in ?? () from /usr/lib64/libcuda.so.1
#19 0x00007f7bb6fa7d62 in ?? () from /usr/lib64/libcuda.so.1
#20 0x00007f7bb6a5e9d3 in ?? () from /usr/local/cuda/lib64/libcudart.so.4
#21 0x00007f7bb6a9318c in cudaDeviceSynchronize () from /usr/local/cuda/lib64/libcudart.so.4
#22 0x00000000004012b3 in main (argc=1, argv=0x7fff20bff048) at event_sampling.cu:157
(gdb)
event_query.cu (3.57 KB)