Hi,
I have the following problem. I have a device function:
__device__ void device_fun(float* Array1, int* Array2, int size)
{
    for (int i = 0; i < 700; i++)
    {
        // some manipulation here
    }
}
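I have a kernel:
__global__ void global_fun(float* Array1, int* Array2, int size)
{
    // body here
    // calling the device function
    device_fun(Array1, Array2, size);
}
int main()
{
    // Array1 and Array2 are assumed to be device pointers set up earlier (not shown).
    global_fun<<<2, 256>>>(Array1, Array2, size);
    return 0;
}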
My question is how threadIdx.x and/or threadIdx.y, blockIdx.x, etc. should be used inside the device function.
I know that it depends on how the problem is organized. I only want to know whether there is any rule or trick for deciding how to handle loops like the one in device_fun.
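To make the question concrete, here is a minimal sketch of one common pattern (a grid-stride loop). It is only an illustration under assumptions, not necessarily the right layout for the real problem. The built-in variables threadIdx, blockIdx, blockDim and gridDim can be read directly inside a __device__ function, so the serial loop can be replaced by one in which each thread starts at its own global index and steps by the total number of threads. The names scale_elements, scale_kernel and run_demo, and the multiply-by-factor body, are made up for this example; the actual manipulation is not shown in the post.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical device function: the loop body is a stand-in for
// "some manipulation". Each thread handles a strided subset of the indices.
__device__ void scale_elements(float* data, const int* factors, int size)
{
    int first  = blockIdx.x * blockDim.x + threadIdx.x;  // global index of this thread
    int stride = gridDim.x * blockDim.x;                  // total number of threads in the grid
    for (int i = first; i < size; i += stride) {
        data[i] *= factors[i];
    }
}

// The kernel just forwards to the device function; the built-in
// index variables are visible in both.
__global__ void scale_kernel(float* data, const int* factors, int size)
{
    scale_elements(data, factors, size);
}

// Host helper (hypothetical): fills data[i] = i and factors[i] = 2,
// runs the kernel, and returns true if every element became 2 * i.
bool run_demo(int size)
{
    assert(size > 0);
    std::vector<float> h_data(size);
    std::vector<int>   h_factors(size, 2);
    for (int i = 0; i < size; ++i) h_data[i] = static_cast<float>(i);

    float* d_data    = nullptr;
    int*   d_factors = nullptr;
    if (cudaMalloc(&d_data, size * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_factors, size * sizeof(int)) != cudaSuccess) {
        cudaFree(d_data);
        return false;
    }
    cudaMemcpy(d_data, h_data.data(), size * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_factors, h_factors.data(), size * sizeof(int), cudaMemcpyHostToDevice);

    // Same launch configuration as in the post: 2 blocks of 256 threads.
    scale_kernel<<<2, 256>>>(d_data, d_factors, size);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_data.data(), d_data, size * sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    cudaFree(d_factors);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < size; ++i) {
        if (h_data[i] != 2.0f * static_cast<float>(i)) return false;
    }
    return true;
}

int main()
{
    bool ok = run_demo(700);
    std::printf("%s\n", ok ? "ok" : "mismatch or CUDA error");
    return ok ? 0 : 1;
}
```

With 2 blocks of 256 threads there are 512 threads in total, so for 700 elements threads 0 to 187 each handle two elements and the remaining threads handle one. Because the loop bound is size rather than the thread count, the same device function stays correct if the launch configuration changes.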
Thanks,
Kundan