Hi everyone.
I have just started learning about CUDA. I know that a warp (32 threads) is divergent if the control flow differs for any of those threads, for example an “if” statement that executes different code depending on the thread index.
My question is:
Suppose I have a “for” loop in my kernel whose number of iterations depends on the input data (i.e. the number of iterations per thread is not the same for all threads in the warp). Would this be considered divergent?
Would each thread have to wait, or would they run in parallel?
Thank you.
(I am sorry if this question was already posted; I couldn’t find it.)
Yes, a loop that terminates at different times for threads in the same warp creates divergence. I’m not sure whether the threads that finish the loop first have to wait at the end of the loop for the rest of the threads in their warp. Different warps can get out of sync with no speed penalty.
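To make the situation concrete, here is a minimal sketch of the kind of kernel being asked about. The names variable_loop, trip_counts and run_variable_loop are made up for illustration and are not from the original post. As far as I understand the SIMT model, lanes that leave the loop early are masked off while the remaining lanes keep iterating, so a warp spends roughly as long in the loop as its slowest thread.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread runs a loop whose trip count comes from
// the input data, so threads in the same warp can exit at different times.
__global__ void variable_loop(const int *trip_counts, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int sum = 0;
    // Data-dependent loop bound: this is the divergent loop from the question.
    for (int k = 0; k < trip_counts[i]; ++k) {
        sum += k;
    }
    out[i] = sum;  // equals trip_counts[i] * (trip_counts[i] - 1) / 2
}

// Hypothetical host helper: copies the input to the device, launches the
// kernel with 32 threads (one warp) per block, and returns the results.
// Error checking is omitted to keep the sketch short.
std::vector<int> run_variable_loop(const std::vector<int> &trip_counts)
{
    int n = static_cast<int>(trip_counts.size());
    std::vector<int> result(n, 0);
    if (n == 0) return result;

    int *d_in = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, trip_counts.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 32;
    int blocks = (n + threads - 1) / threads;
    variable_loop<<<blocks, threads>>>(d_in, d_out, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(result.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

If the trip counts within one warp are, say, 0, 1, 4 and 10, the results are still correct for every thread, but the warp as a whole is busy for about 10 iterations. Sorting or grouping the input so that threads in the same warp get similar trip counts is one way to reduce the wasted work.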