In the book “Programming Massively Parallel Processors” by David B. Kirk et al.,
it says that an if-then-else needs “multiple passes”: one that follows the threads taking the ‘then’ part and one that follows the threads taking the ‘else’ part.
Could you experts explain what “passes” means in this case?
In my case, I have a kernel that uses shared memory only under a certain condition.
That is:
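__global__ void kernel(const float *global_C, float *global_b)  /* float is an assumed element type */
{
    __shared__ float d_A[256];  /* assumes blockDim.x <= 256 */
    int gid = blockIdx.x * blockDim.x + threadIdx.x;  /* global index of this thread */
    d_A[threadIdx.x] = global_C[gid];  /* copy some data from global_C to shared memory d_A */
    __syncthreads();  /* needed once threads read d_A entries written by other threads */
    if (threadIdx.x > 0 && threadIdx.x < 100)
        global_b[gid] = d_A[threadIdx.x];  /* I can use d_A only under a certain condition, and this involves multiple calculations on d_A[] */
    else
        global_b[gid] = global_C[gid];  /* otherwise I have to get b[] by performing multiple calculations on global_C[] */
}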
NOTE: I made the code very simple, but hopefully you get the main idea.
Does ‘pass’ refer to threadIdx.x?
Somehow, for this example I don’t see any divergence problem, and I wonder why.
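To see which warps the condition actually splits, a small check like the one below might help. It is only a sketch, assuming a warp size of 32, a single block of 256 threads, and CUDA 9 or later for __ballot_sync. The kernel name branchMasks and the masks buffer are made up for illustration and are not from the book.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper kernel: for each warp, record which lanes
// would take the 'then' part of the branch.
__global__ void branchMasks(unsigned int *masks)
{
    // Same condition as in the kernel above.
    bool takesThen = (threadIdx.x > 0 && threadIdx.x < 100);

    // All 32 lanes reach this call, so the full participation mask is valid.
    // Bit i of the result is set if lane i evaluated takesThen as true.
    unsigned int mask = __ballot_sync(0xffffffffu, takesThen);

    // Lane 0 of each warp stores that warp's result.
    if (threadIdx.x % warpSize == 0)
        masks[threadIdx.x / warpSize] = mask;
}

int main()
{
    const int threads = 256;          // assumed block size
    const int warps = threads / 32;   // assumes a warp size of 32
    unsigned int h_masks[warps];
    unsigned int *d_masks = nullptr;

    cudaMalloc(&d_masks, warps * sizeof(unsigned int));
    branchMasks<<<1, threads>>>(d_masks);
    cudaMemcpy(h_masks, d_masks, sizeof(h_masks), cudaMemcpyDeviceToHost);

    for (int w = 0; w < warps; ++w) {
        // A warp is uniform if no lane or every lane takes the 'then' part.
        bool uniform = (h_masks[w] == 0u || h_masks[w] == 0xffffffffu);
        printf("warp %d: then-mask 0x%08x -> %s\n", w, h_masks[w],
               uniform ? "uniform" : "divergent");
    }

    cudaFree(d_masks);
    return 0;
}
```

Under those assumptions, only warp 0 (thread 0 goes to the ‘else’ part) and warp 3 (threads 96 to 99 go to the ‘then’ part) contain both paths. Every other warp takes a single path, which may be why divergence is hard to notice in this example.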
Any comments are much appreciated, and many thanks in advance…