How to get cuda-gdb to break within __device__ functions?

As the title says, how do you get cuda-gdb to break within __device__ functions? E.g.:

1 __device__ void foo() {
2
3 do something;
4 do something2;
5
6 }
7
8 __global__ void bar() {
9
10 foo();
11
12 }

I set cuda-gdb to break at line 4, but it never actually breaks there. AFAIK it's because __device__ functions are implicitly inlined, but even when I prefix the function with __noinline__ it doesn't work.
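For reference, here is a self-contained sketch of the kind of setup I mean. The file name repro.cu, the int *out parameter, the arithmetic inside foo(), and the launch configuration are all placeholders I made up for illustration; they are not my real code. It also assumes the program is built with device debug info (nvcc -g -G), which may or may not be relevant to the problem.

```cuda
// repro.cu: hypothetical file name, illustrative only.
// Assumed build command: nvcc -g -G -o repro repro.cu
#include <cstdio>
#include <cuda_runtime.h>

// __noinline__ asks the compiler not to inline foo() into its callers.
__device__ __noinline__ void foo(int *out) {
    int a = threadIdx.x + 1;   // stands in for "do something"
    out[threadIdx.x] = a * 2;  // stands in for "do something2" (breakpoint target)
}

__global__ void bar(int *out) {
    foo(out);
}

int main() {
    const int n = 4;           // placeholder thread count
    int *d_out = nullptr;
    int h_out[n] = {0};

    cudaMalloc(&d_out, n * sizeof(int));
    bar<<<1, n>>>(d_out);
    // Wait for the kernel to finish before copying results back.
    cudaDeviceSynchronize();
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    for (int i = 0; i < n; ++i) printf("%d\n", h_out[i]);
    return 0;
}
```

With something like this I would expect to be able to start cuda-gdb on the binary, set the breakpoint with "break foo" or "break repro.cu:<line of the second statement>", and then "run".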

Help?