Why does mbarrier.test_wait take a phase bit? Is it meant for controlling the order of tasks? For example, could it coordinate concurrent GEMM and Softmax operations by assigning each one a different phase?