How to measure time IN a kernel?

Hi!

I want to measure the time elapsed IN a kernel, not how long the whole kernel takes to execute.

Something like this:

some_kernel(){
  start timer
  do_something1
  <-----measure the time elapsed till here------->
  do_something2
  <-----measure the time elapsed till here------->
  do_something3
  <-----measure the time elapsed till here------->
}

Is there any way to do that?

You can use a bit of inline assembly to read the PTX cycle counter:

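__device__ unsigned int get_clock(void)
{
  unsigned int clock;
  // volatile keeps the compiler from merging or hoisting repeated reads;
  // %% is the escape for a literal % when the asm statement has operands
  asm volatile("mov.u32 %0, %%clock;" : "=r"(clock) );
  return clock;
}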

It seems to run at half the shader clock.
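
To connect this to the checkpoints in the question, here is a minimal sketch of how the counter could be sampled at each point. The names read_cycle_counter, busy_work, timed_kernel and run_timed_kernel are hypothetical, and busy_work is only a stand-in for do_something1..3. The results are raw cycle counts of one thread, measured from the start of the kernel. The compiler may still move independent arithmetic across the counter reads, so treat the numbers as approximate.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: reads the 32-bit cycle counter through inline PTX.
__device__ unsigned int read_cycle_counter(void)
{
    unsigned int c;
    asm volatile("mov.u32 %0, %%clock;" : "=r"(c));
    return c;
}

// Placeholder work standing in for do_something1..3 (an assumption).
__device__ float busy_work(float x, int iters)
{
    for (int i = 0; i < iters; ++i)
        x = x * 1.0001f + 0.5f;
    return x;
}

__global__ void timed_kernel(float *out, unsigned int *cycles)
{
    unsigned int t0 = read_cycle_counter();   // start timer
    float v = busy_work((float)threadIdx.x, 100);
    unsigned int t1 = read_cycle_counter();   // after do_something1
    v = busy_work(v, 200);
    unsigned int t2 = read_cycle_counter();   // after do_something2
    v = busy_work(v, 300);
    unsigned int t3 = read_cycle_counter();   // after do_something3

    // Store the result so the work is not optimized away.
    out[blockIdx.x * blockDim.x + threadIdx.x] = v;

    // One thread reports cycles elapsed since t0. Unsigned subtraction
    // stays correct across a single wraparound of the 32-bit counter.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        cycles[0] = t1 - t0;
        cycles[1] = t2 - t0;
        cycles[2] = t3 - t0;
    }
}

// Launches one warp and copies the three cycle counts back to the host.
// Returns false if any CUDA call fails.
bool run_timed_kernel(unsigned int h_cycles[3])
{
    float *d_out = nullptr;
    unsigned int *d_cycles = nullptr;
    if (cudaMalloc(&d_out, 32 * sizeof(float)) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_cycles, 3 * sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_out);
        return false;
    }
    timed_kernel<<<1, 32>>>(d_out, d_cycles);
    // cudaMemcpy waits for the kernel and also reports launch errors.
    cudaError_t err = cudaMemcpy(h_cycles, d_cycles, 3 * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    cudaFree(d_cycles);
    return err == cudaSuccess;
}
```

Calling run_timed_kernel from main and printing the three values gives the cycles elapsed up to each checkpoint. Because the counter is only 32 bits wide, a measured section that runs longer than 2^32 cycles would wrap more than once and give a wrong result.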

Thank you very much!
That works very well for me

You could also use the built-in method!

clock_t clock();

See section B.10 of the CUDA 3.1 programming guide for more info. The inline PTX is just as effective though…
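
For comparison, a rough sketch of the built-in route. It uses clock64(), the 64-bit sibling of clock() available on devices of compute capability 2.0 and later, so wraparound is not a practical concern; on older hardware clock() would be used the same way. The kernel name clock_kernel, the helper run_clock_kernel and the loop body are made up for illustration. The counter is kept per multiprocessor, so only differences taken within the same thread are meaningful; values from different blocks should not be compared against each other as absolute times.

```cuda
#include <cuda_runtime.h>

// Each block records how many clock ticks its thread 0 spent on the work.
__global__ void clock_kernel(float *data, long long *elapsed)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    long long start = clock64();

    // Placeholder work (an assumption, stands in for the real kernel body).
    float v = data[i];
    for (int k = 0; k < 256; ++k)
        v = v * 0.999f + 1.0f;
    data[i] = v;

    long long stop = clock64();

    if (threadIdx.x == 0)
        elapsed[blockIdx.x] = stop - start;
}

// Runs `blocks` blocks of 64 threads and copies the per-block tick counts
// into h_elapsed. Returns false if any CUDA call fails.
bool run_clock_kernel(long long *h_elapsed, int blocks)
{
    const int threads = 64;
    float *d_data = nullptr;
    long long *d_elapsed = nullptr;
    size_t n = (size_t)blocks * threads;

    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_elapsed, blocks * sizeof(long long)) != cudaSuccess) {
        cudaFree(d_data);
        return false;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    clock_kernel<<<blocks, threads>>>(d_data, d_elapsed);
    // cudaMemcpy waits for the kernel and also reports launch errors.
    cudaError_t err = cudaMemcpy(h_elapsed, d_elapsed,
                                 blocks * sizeof(long long),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_elapsed);
    return err == cudaSuccess;
}
```

The values are ticks, not seconds. Turning them into wall-clock time needs the rate at which the counter advances on the particular GPU, which is worth measuring rather than assuming.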

Thanks, Profquail!

I somehow had in the back of my head that there was a CUDA function for this already, but apparently missed it when I quickly scanned the Programming Guide appendices to check.

Thank you both very much, love this board :)