I have the following problem. I have a device function:
__device__ void device_fun(float* Array1, int* Array2, int size)
{
    for (int i = 0; i < 700; i++)
    {
        // some manipulation here
    }
}
I have a kernel:
__global__ void global_fun()
{
    // calling device functions
    device_fun(Array1, Array2, size);
}
which I launch as:
global_fun<<<2, 256>>>();
My question is how threadIdx.x and/or threadIdx.y, blockIdx.x, etc. should be used inside a device function.
I know that it depends on how the problem is organized.
I only want to know whether there is any rule or trick for deciding how to handle loops like the one above.
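To make the question concrete, here is a minimal sketch of one common pattern, the grid-stride loop, written inside a device function. It relies on the fact that threadIdx, blockIdx, blockDim and gridDim are visible in __device__ functions exactly as they are in the kernel. The loop body (doubling each element), the kernel parameters, and the host helper run_demo are assumptions made up for illustration; they are not part of the code above, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// The loop body is a hypothetical stand-in for "some manipulation here".
__device__ void device_fun(float* Array1, const int* Array2, int size)
{
    // Built-in index variables can be read directly in a __device__ function.
    int first  = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's global index
    int stride = gridDim.x * blockDim.x;                 // total number of threads in the grid

    // Grid-stride loop: this thread handles i = first, first + stride, ...
    // so the serial "for i in 0..size" loop is shared out across all threads.
    for (int i = first; i < size; i += stride) {
        Array1[i] = 2.0f * Array2[i];
    }
}

// Assumed kernel signature: the arrays are passed in as parameters.
__global__ void global_fun(float* Array1, const int* Array2, int size)
{
    device_fun(Array1, Array2, size);
}

// Hypothetical host helper: fills the input with 0..size-1, runs the kernel
// with the <<<2, 256>>> launch from the question, and returns the output.
std::vector<float> run_demo(int size)
{
    assert(size > 0);
    std::vector<int> h_in(size);
    for (int i = 0; i < size; ++i) h_in[i] = i;
    std::vector<float> h_out(size, -1.0f);

    float* d_out = nullptr;
    int*   d_in  = nullptr;
    cudaMalloc(&d_out, size * sizeof(float));
    cudaMalloc(&d_in,  size * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), size * sizeof(int), cudaMemcpyHostToDevice);

    global_fun<<<2, 256>>>(d_out, d_in, size);

    // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, size * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With 2 blocks of 256 threads there are 512 threads for 700 elements, so threads 0 to 187 run the loop body twice and the rest run it once. The same code stays correct if the launch configuration or size changes, which is the main reason for the stride. This only applies when the iterations are independent of each other; a loop where iteration i needs the result of iteration i-1 cannot be split up this way.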