Getting thread context in device emulation: finding the return address of a function

Hi,

I would like to find out the return address of a device function (for use in device emulation only), and I was wondering if anyone knows how to do that. I'm using Linux.

The following structure illustrates my point:-

__device__ inline void helperfunc1()
{
    // %p is the printf conversion for an address; the argument should be a void *
    printf("This function was called from address %p\n", [return address function]);
}

__global__ void kernel(arguments)
{
    // code
    helperfunc1();
    // code
}

I tried using __builtin_return_address(1), which is a GCC builtin (and works on my system).
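To show what I mean by "works on my system", here is a minimal host-only sketch in plain C compiled with gcc, with no CUDA involved. The names record_call_site and show_call_sites are made up for this illustration. The sketch uses level 0, which is the return address of the function that contains the builtin, and it assumes the helper is not inlined (otherwise there is no call and no return address of its own).

```c
#include <stdio.h>

/* Hypothetical helper: stores the address its caller will resume at.
   noinline keeps this a real call, so it has its own return address. */
__attribute__((noinline)) void record_call_site(void **out)
{
    /* Level 0 = return address of the current function. */
    *out = __builtin_return_address(0);
}

/* Hypothetical demo: two call sites yield two different addresses. */
void show_call_sites(void)
{
    void *first = NULL;
    void *second = NULL;

    record_call_site(&first);
    record_call_site(&second);

    printf("first call returns to  %p\n", first);
    printf("second call returns to %p\n", second);
}
```

Calling show_call_sites() from main prints two distinct addresses, one per call site, which is the kind of identifier I am after inside the emulated kernel.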

On trying to compile my code (using make emu=1), I get the following error:-

obj/emurelease/genop.cu_o: In function `void gpu_genop<50u, 0u>(float, float*, float*, unsigned int)':

genop.cu.c:(.text+0xb3d): undefined reference to `__cuda_return_address'

genop.cu.c:(.text+0xb9c): undefined reference to `__cuda_return_address'

collect2: ld returned 1 exit status

make: *** [out/emurelease/genop] Error 1

What is the function __cuda_return_address()? Can someone explain how to make it work?

My actual problem is uniquely identifying the location (out of potentially many) from which a function was called in a thread. I could do that if I had the return instruction pointer.

Thanks for the help.