I have a question. As far as I know, the shuffle instructions are mostly used when the source lane ID is known at compile time. In my case I have to compute the lane ID at runtime. How can I guarantee correctness?
The code looks like this:
int item_index = ch * NUM_STATES + all_states[s];
// Which lane stores the state?
int target_lane = item_index % WARP_SIZE;
// What is the index within that lane's table?
int target_index = item_index / WARP_SIZE;
all_states[s] = __shfl_sync(0xffffffff, local_table[target_index], target_lane, WARP_SIZE);
target_lane is the source lane for the shuffle and is calculated at runtime. local_table is a per-thread array declared as "int local_table[N];". It is read-only: I never update the values in local_table.
When I run my program, the value returned by __shfl_sync is incorrect.
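For reference, here is a minimal, self-contained sketch of this kind of lookup. One thing worth checking is that __shfl_sync returns the value of its second argument as evaluated by the source lane, so the source lane indexes local_table with its own target_index, not with the index computed by the lane that reads the result. The sketch sidesteps this by shuffling every slot and keeping only the one the calling lane wants. The names lookup_demo, run_lookup_demo, table, items and out, the value N = 4, and the test data are all hypothetical and not taken from the original program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int WARP_SIZE = 32;
constexpr int N = 4;  // hypothetical number of table slots held by each lane

// Assumed layout, as in the question: logical entry `item` lives in
// lane (item % WARP_SIZE), slot (item / WARP_SIZE).
__global__ void lookup_demo(const int* table, const int* items, int* out)
{
    int lane = threadIdx.x % WARP_SIZE;

    // Each lane loads its own read-only slice of the table.
    int local_table[N];
    for (int i = 0; i < N; ++i)
        local_table[i] = table[i * WARP_SIZE + lane];

    int item_index   = items[lane];            // known only at runtime
    int target_lane  = item_index % WARP_SIZE; // lane that holds the entry
    int target_index = item_index / WARP_SIZE; // slot within that lane

    // Shuffle every slot with a uniform index, so the source lane always
    // publishes local_table[i]; keep the value only for the wanted slot.
    int result = 0;
    #pragma unroll
    for (int i = 0; i < N; ++i) {
        int v = __shfl_sync(0xffffffff, local_table[i], target_lane, WARP_SIZE);
        if (i == target_index) result = v;
    }
    out[lane] = result;
}

// Host helper: runs one warp and returns the number of lanes whose result
// differs from a direct lookup in the host copy of the table.
int run_lookup_demo()
{
    constexpr int TOTAL = N * WARP_SIZE;
    int h_table[TOTAL], h_items[WARP_SIZE], h_out[WARP_SIZE];
    for (int i = 0; i < TOTAL; ++i) h_table[i] = 1000 + i;
    for (int l = 0; l < WARP_SIZE; ++l) h_items[l] = (l * 37 + 5) % TOTAL;

    int *d_table, *d_items, *d_out;
    cudaMalloc(&d_table, sizeof(h_table));
    cudaMalloc(&d_items, sizeof(h_items));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_table, h_table, sizeof(h_table), cudaMemcpyHostToDevice);
    cudaMemcpy(d_items, h_items, sizeof(h_items), cudaMemcpyHostToDevice);

    lookup_demo<<<1, WARP_SIZE>>>(d_table, d_items, d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    cudaFree(d_table);
    cudaFree(d_items);
    cudaFree(d_out);

    int mismatches = 0;
    for (int l = 0; l < WARP_SIZE; ++l)
        if (h_out[l] != h_table[h_items[l]]) ++mismatches;
    return mismatches;
}
```

Calling run_lookup_demo() from main and printing the returned count is enough to try it. The loop costs N shuffles per lookup instead of one. On the other hand, because every index into local_table is a compile-time constant after unrolling, the compiler is more likely to keep the array in registers rather than local memory. All 32 lanes must reach each __shfl_sync call for the full mask 0xffffffff to be valid.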
Thanks a lot for any help!