Using thread indices in a device function

Hi,

I have the following device function:

__device__ void device_fun(float* Array1, int* Array2, int size)
{
    for (int i = 0; i < 700; i++)
    {
        // some manipulation here.
    }
}

I have a kernel:

__global__ void global_fun()
{
    // body here.

    // calling the device function
    device_fun(Array1, Array2, size);
}

int main()
{
    global_fun<<<2, 256>>>();
    cudaDeviceSynchronize();  // wait for the kernel to finish
    return 0;
}

My question is:

  • How should threadIdx.x and/or threadIdx.y, blockIdx.x, etc. be used inside a device function?
    I know that it will depend on how the problem is organized.

    I only want to know whether there is any rule or trick for deciding how to handle loops like the one above.
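
To make the question concrete, here is a minimal sketch of one pattern I am considering: a grid-stride loop inside the device function. As far as I understand, threadIdx, blockIdx, blockDim and gridDim can be read directly in a __device__ function, so each thread can pick its own loop iterations. The kernel parameters, the placeholder arithmetic, and the host helper run_example are my own assumptions for illustration, not the real code.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread processes a strided subset of the loop iterations.
// The built-in index variables are read directly inside the device function.
__device__ void device_fun(float* Array1, const int* Array2, int size)
{
    int first  = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    int stride = gridDim.x * blockDim.x;                 // total threads in the grid

    for (int i = first; i < size; i += stride)
    {
        // Placeholder manipulation (assumption): stands in for the real work.
        Array1[i] = 2.0f * static_cast<float>(Array2[i]);
    }
}

// Arrays are passed as kernel parameters here (assumption).
__global__ void global_fun(float* Array1, const int* Array2, int size)
{
    device_fun(Array1, Array2, size);
}

// Hypothetical host helper: fills the input with 0..size-1, runs the kernel
// with the same <<<2, 256>>> launch as above, and returns the output.
std::vector<float> run_example(int size)
{
    std::vector<int> h_in(size);
    for (int i = 0; i < size; ++i) h_in[i] = i;
    std::vector<float> h_out(size, 0.0f);

    float* d_out = nullptr;
    int*   d_in  = nullptr;
    cudaMalloc(&d_out, size * sizeof(float));
    cudaMalloc(&d_in,  size * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), size * sizeof(int), cudaMemcpyHostToDevice);

    global_fun<<<2, 256>>>(d_out, d_in, size);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, size * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    cudaFree(d_in);
    return h_out;
}
```

With the <<<2, 256>>> launch there are 512 threads, so for size = 700 the first 188 threads would handle two elements each and the rest one. This only seems valid when the loop iterations are independent of each other, which is part of what I am unsure about.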

Thanks,
Kundan