The warp keeps looping for as long as any of its threads is still in the loop; threads that have already exited just sit idle until the rest finish. It won’t serialize the threads one after another. So in your example, it will take just 31 iterations, assuming this warp’s tid values range from 0 to 31. (Careful: threadIdx.x ranges over all the threads in the block, not just the threads in one warp, so later warps see larger values.)
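Here is a rough sketch of what I mean. The loop from the question isn't quoted here, so `for (i = 0; i < threadIdx.x; ++i)` is my assumed stand-in, and `divergentLoop` and `warpPassCounts` are names I made up for illustration. It doesn't measure anything on the hardware; it just records each thread's trip count and takes the maximum across the warp, which per the explanation above is the number of passes the warp makes.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Assumed stand-in for the loop in the question: each thread iterates
// threadIdx.x times, so trip counts differ across the lanes of a warp.
__global__ void divergentLoop(float *out, int *warpPasses)
{
    int tid  = threadIdx.x;
    int lane = tid % warpSize;

    float acc = 0.0f;
    int trips = 0;
    for (int i = 0; i < tid; ++i) {   // lanes drop out one by one
        acc = acc * 0.5f + 1.0f;      // arbitrary per-iteration work
        ++trips;
    }
    out[tid] = acc;

    // Warp-wide maximum of the per-thread trip counts. The full-mask
    // shuffle also brings the lanes back together after the loop.
    int passes = trips;
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        passes = max(passes, __shfl_down_sync(0xffffffffu, passes, offset));
    if (lane == 0)
        warpPasses[tid / warpSize] = passes;
}

// Hypothetical host helper: launches one block and returns one pass count
// per warp (an empty vector if a CUDA call fails).
std::vector<int> warpPassCounts(int threadsPerBlock)
{
    // The full-mask shuffle above needs complete warps.
    assert(threadsPerBlock > 0 && threadsPerBlock % 32 == 0 && threadsPerBlock <= 1024);
    int numWarps = threadsPerBlock / 32;
    std::vector<int> host(numWarps, -1);

    float *dOut = nullptr;
    int *dPasses = nullptr;
    if (cudaMalloc(&dOut, threadsPerBlock * sizeof(float)) != cudaSuccess)
        return {};
    if (cudaMalloc(&dPasses, numWarps * sizeof(int)) != cudaSuccess) {
        cudaFree(dOut);
        return {};
    }

    divergentLoop<<<1, threadsPerBlock>>>(dOut, dPasses);
    // cudaMemcpy waits for the kernel and reports any launch/run error.
    cudaError_t err = cudaMemcpy(host.data(), dPasses, numWarps * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    cudaFree(dPasses);
    if (err != cudaSuccess)
        return {};
    return host;
}
```

With a 64-thread block, `warpPassCounts(64)` should give 31 for the first warp and 63 for the second, which is the threadIdx.x caveat in action. If you want every warp to behave like the first one, bound the loop by `threadIdx.x % warpSize` instead.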