What do "row" and "col" mean in mma.sync.aligned.m16n8k4.row.col.f32.tf32.tf32.f32?

Here each thread supplies its own piece of the data directly, following the layout in the graph. We are not reading from memory, so as far as I can tell there is no difference between row-major and column-major. Why do we have row and col here?
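To make the setup concrete, here is a minimal sketch of what I mean by assigning the data per thread. The kernel name `mma_tf32_demo` and the all-ones inputs are only illustrative assumptions, and it assumes a single warp on an sm_80 or newer GPU (compiled with something like `nvcc -arch=sm_80`). The operand list follows the register counts for this shape: two 32-bit registers for A, one for B, and four f32 registers each for C and D.

```cuda
#include <cstdio>
#include <cstdint>

// Hypothetical demo kernel: must be launched with exactly one warp (32 threads).
__global__ void mma_tf32_demo(float *out) {
    // Each thread fills its own fragment registers; nothing is loaded from memory.
    // 1.0f is exactly representable in tf32, so its raw bits can be passed as-is.
    uint32_t a0 = __float_as_uint(1.0f);  // this thread's first A element
    uint32_t a1 = __float_as_uint(1.0f);  // this thread's second A element
    uint32_t b0 = __float_as_uint(1.0f);  // this thread's single B element
    float c0 = 0.0f, c1 = 0.0f, c2 = 0.0f, c3 = 0.0f;  // accumulator inputs
    float d0, d1, d2, d3;                               // accumulator outputs

    asm volatile(
        "mma.sync.aligned.m16n8k4.row.col.f32.tf32.tf32.f32 "
        "{%0, %1, %2, %3}, {%4, %5}, {%6}, {%7, %8, %9, %10};\n"
        : "=f"(d0), "=f"(d1), "=f"(d2), "=f"(d3)
        : "r"(a0), "r"(a1), "r"(b0),
          "f"(c0), "f"(c1), "f"(c2), "f"(c3));

    // Store this thread's four D elements so the host can inspect them.
    out[threadIdx.x * 4 + 0] = d0;
    out[threadIdx.x * 4 + 1] = d1;
    out[threadIdx.x * 4 + 2] = d2;
    out[threadIdx.x * 4 + 3] = d3;
}

int main() {
    float *out = nullptr;
    cudaMallocManaged(&out, 32 * 4 * sizeof(float));
    mma_tf32_demo<<<1, 32>>>(out);
    cudaDeviceSynchronize();
    // With all-ones A (16x4) and B (4x8) and zero C, every element of D
    // is a sum of four 1*1 products, whichever thread holds which element.
    printf("d0 of lane 0 = %f\n", out[0]);
    cudaFree(out);
    return 0;
}
```

With uniform inputs like these the result does not depend on which matrix element each register holds, so the sketch only shows how the operands are passed. It does not show where the row and col qualifiers come into play.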