Here is the code:
I launch 256 threads in a block, and it’s divided into 2 warp groups.
int warp_group_idx = cutlass::canonical_warp_group_idx();
if (warp_group_idx == 0) {
// do some thing
__sync_threads();
} else {
// do others
}
Is it legal to use like this?
Is there any documentation about this?
In my knowledge, it’s illegal and will cause dead lock.
But I see it used in Flash MLA: FlashMLA/csrc/flash_fwd_mla_kernel.h at b549289fb4ce2a0a4a1d5e9e615a17a2e09f3286 · deepseek-ai/FlashMLA · GitHub
Curefab
February 25, 2025, 11:07am
2
That seems to be suspicious.
Could it correspond to the __syncthreads()
in line 400, which also is inside a loop from n_block
to n_block_min
?
Thank you! Seems that’s the fact.
Can __syncthreads() be used in this situation? I have never seen some documentation about that.
if(x > 0) {
__syncthreads();
} else {
__syncthreads();
}
In this code, the two __syncthreads()
are the same synchronization point? Why is that?
Curefab
February 25, 2025, 3:48pm
4
I would avoid this, if possible.
__syncthreads() is translated into bar.sync instructions.
They are available as .aligned and non-aligned variants.
The .aligned ones have to be a single instruction, the non-aligned ones can have the same instruction at multiple locations for the still running threads of a warp.
But at the same time the C++ compiler has to know that those two __syncthreads() instructions are executed together.
Let’s avoid it.
For example replace
if (b) {
T1();
__syncthreads();
T2();
} else {
F1();
__syncthreads();
F2();
}
with
if (b) {
T1();
} else {
F1();
}
__syncthreads();
if (b) {
T2();
} else {
F2();
}
Wow, I see that. So how can the compiler know they correspond to each other?
Does it search __syncthreads
in the other branch, and relate them together?
Curefab
February 26, 2025, 3:59am
6
I do not believe, it fully can. So it would perhaps always translate to the non-aligned variant.
The construct is quite brittle, if not managed well by the programmer.
Whether it works, can be seen at runtime.
The compiler has to make sure to keep the instruction and the order of instructions.
1 Like