I have just started learning CUDA. I understand that a warp (32 threads) is divergent if the control flow differs for any of its threads, for example an “if” statement that executes different code depending on the thread index.
My question is:
Suppose I have a “for” loop in my kernel whose number of iterations depends on the input data (i.e. the number of iterations is not the same for all threads in the warp). Would this be considered divergent?
Would each thread have to wait for the others, or would they run in parallel?
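To make the question concrete, here is a rough sketch of the kind of kernel I mean. The names `variableLoop`, `counts`, and `out` are made up for illustration, and the loop body is just a placeholder; the only point is that the trip count comes from the input data, so threads in the same warp can loop a different number of times.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: thread i runs its loop counts[i] times, so threads
// in the same warp may execute a different number of iterations.
__global__ void variableLoop(const int *counts, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int acc = 0;
    for (int k = 0; k < counts[i]; ++k) {  // trip count depends on input data
        acc += k;                          // placeholder work
    }
    out[i] = acc;
}

int main()
{
    const int n = 32;                      // exactly one warp
    int hCounts[n], hOut[n];
    for (int i = 0; i < n; ++i) hCounts[i] = i;  // thread i loops i times

    int *dCounts = nullptr, *dOut = nullptr;
    cudaMalloc(&dCounts, n * sizeof(int));
    cudaMalloc(&dOut, n * sizeof(int));
    cudaMemcpy(dCounts, hCounts, n * sizeof(int), cudaMemcpyHostToDevice);

    variableLoop<<<1, n>>>(dCounts, dOut, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hOut, dOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) printf("thread %d: %d\n", i, hOut[i]);

    cudaFree(dCounts);
    cudaFree(dOut);
    return 0;
}
```

In this sketch thread 0 never enters the loop while thread 31 runs it 31 times, which is the situation I am asking about.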
(I am sorry if this question has already been posted; I couldn’t find it.)