I would like to know whether the algorithm described below can be implemented in CUDA:
Given N points, I assign one point to each CUDA thread, and each thread runs code that searches for a local maximum of some objective function.
Namely,
__global__ void mycuda_kernelfunction(...) {
    while (1) {
        // 1. Given this thread's point, find the next point that improves the objective function.
        // 2 ... N-1. Some other work ...
        // N. Stop the while loop if some condition is true; otherwise continue.
    }
}
Is the code above possible?
In the code above, each thread will be at a different step at any given moment. For example, one thread may be at step 2, another at step 14, and some of the others at step N.
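To make the question concrete, here is a minimal sketch of the pattern I am asking about. Everything specific in it is an assumption made for illustration and not my real code: the objective f(x) = -(x - 3)^2, the fixed-step gradient ascent, the step size and tolerance, and the names objective_gradient, hill_climb and run_hill_climb. Error checking of the CUDA calls is left out to keep it short.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical objective for illustration: f(x) = -(x - 3)^2, with its maximum at x = 3.
// This is its derivative, f'(x) = -2 (x - 3).
__device__ float objective_gradient(float x)
{
    return -2.0f * (x - 3.0f);
}

// One thread per point. Each thread loops until ITS OWN stop condition holds,
// so the number of iterations depends on the thread's starting point.
__global__ void hill_climb(float *points, int *steps, int n,
                           float rate, float tol, int max_steps)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float x = points[i];
    int step = 0;
    while (step < max_steps) {
        float g = objective_gradient(x);
        if (fabsf(g) < tol) break;  // per-thread stop condition
        x += rate * g;              // move to the next point (gradient ascent)
        ++step;
    }
    points[i] = x;    // final point found by this thread
    steps[i] = step;  // how many iterations this thread needed
}

// Host helper (hypothetical): copies the points to the GPU, runs the kernel,
// overwrites `points` with the results and returns the per-thread step counts.
std::vector<int> run_hill_climb(std::vector<float> &points)
{
    int n = static_cast<int>(points.size());
    std::vector<int> steps(n, 0);
    if (n == 0) return steps;

    float *d_points = nullptr;
    int *d_steps = nullptr;
    cudaMalloc(&d_points, n * sizeof(float));
    cudaMalloc(&d_steps, n * sizeof(int));
    cudaMemcpy(d_points, points.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    hill_climb<<<blocks, threads>>>(d_points, d_steps, n, 0.1f, 1e-3f, 1000);

    // These copies run after the kernel on the default stream.
    cudaMemcpy(points.data(), d_points, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(steps.data(), d_steps, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_points);
    cudaFree(d_steps);
    return steps;
}
```

Calling run_hill_climb on starting points such as 3, 4 and 103 would have each thread leave the while loop after a different number of iterations, even though the three threads sit in the same warp. That is exactly the situation I am unsure about.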
As I understand it, SIMD means single instruction, multiple data, so the threads that are grouped together on an SM (that is, in a warp) must all be executing the same step. I am confused about how this fits with the thread programming model in CUDA.
I think the code above is therefore impossible with CUDA threads.