In CUDA, we can use clock() to measure the running time of a code fragment inside a kernel function. I want to implement a function similar to clock() in OpenCL
that can measure time on the device. Can anybody give me some advice? Thanks!
You should probably take a look at OpenCL's built-in profiling capabilities; see, e.g., the documentation for the clGetEventProfilingInfo() function.
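As a rough sketch of how event profiling is typically used on the host side: the command queue is created with CL_QUEUE_PROFILING_ENABLE, the kernel is enqueued with an event, and after waiting on that event the start and end timestamps are read with clGetEventProfilingInfo(). Both values are device time counters in nanoseconds. The helper name profiling_elapsed_ms below is hypothetical (it is not part of OpenCL); it only shows the conversion step, and the actual OpenCL calls are given in the comment because they need a live context and device.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the OpenCL API.
 *
 * In a real host program the two timestamps would be read roughly like this
 * (ev is the cl_event passed to clEnqueueNDRangeKernel, and the queue must
 * have been created with CL_QUEUE_PROFILING_ENABLE):
 *
 *   cl_ulong start_ns, end_ns;
 *   clWaitForEvents(1, &ev);
 *   clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_START,
 *                           sizeof(cl_ulong), &start_ns, NULL);
 *   clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END,
 *                           sizeof(cl_ulong), &end_ns, NULL);
 *
 * Both counters are 64-bit device timestamps in nanoseconds.
 */
double profiling_elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    /* Difference in nanoseconds, converted to milliseconds. */
    return (double)(end_ns - start_ns) / 1.0e6;
}
```

Note that this measures the whole kernel command from start to end, not an individual fragment inside the kernel.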
I have used the clGetEventProfilingInfo() function with an event object to measure the running time of a whole kernel function. However, what if I just want to measure
a few lines of code inside the kernel function, as follows:
__kernel void clock (…)
{
    unsigned int t1 = p1;
    unsigned int t2 = p2;
    unsigned int start_time = 0, stop_time = 0;
    for (int i = 0; i < its; i++)
    {
        start_time = clock(); // CUDA built-in function; how can OpenCL do this?
        repeat64(t1+=t2;t2+=t1;)
        stop_time = clock();  // CUDA built-in function
    }
    out[0] = t1+t2;
    ......
}