How Does TMA Work for Writing from Shared Memory (sS) to Global Memory (gD)?

Could you explain how a TMA store works? For example, when we write from the shared-memory tensor sS to the global-memory tensor gD, it looks as if the data is written sequentially, i.e., element sS[i] maps directly to gD[i] by linear index. Is this correct, or is the mapping determined by the tensors' coordinates and layouts?
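
To make the question concrete, here is a minimal host-side C++ sketch of the two readings. It is only a model and not the CuTe or CUDA API: `TileStore`, `store_linear`, and `store_by_coord` are hypothetical names, and both tensors are assumed to be row-major. The first function is the linear mapping described above; the second maps by tile coordinate, which is what one would expect if the copy is driven by a descriptor that carries the global tensor's shape and strides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a tile store (not a real TMA descriptor).
struct TileStore {
    std::size_t tile_rows;  // rows of the shared-memory tile sS
    std::size_t tile_cols;  // columns of the shared-memory tile sS
    std::size_t g_ld;       // row stride (leading dimension) of gD
    std::size_t row0;       // tile origin row inside gD
    std::size_t col0;       // tile origin column inside gD
};

// Reading A: linear copy, gD[i] = sS[i].
void store_linear(const std::vector<float>& sS, std::vector<float>& gD) {
    assert(sS.size() <= gD.size());
    for (std::size_t i = 0; i < sS.size(); ++i) {
        gD[i] = sS[i];
    }
}

// Reading B: coordinate-wise copy, sS(r, c) -> gD(row0 + r, col0 + c).
void store_by_coord(const TileStore& t,
                    const std::vector<float>& sS,
                    std::vector<float>& gD) {
    assert(sS.size() == t.tile_rows * t.tile_cols);
    for (std::size_t r = 0; r < t.tile_rows; ++r) {
        for (std::size_t c = 0; c < t.tile_cols; ++c) {
            // Dense offset in the tile, strided offset in the global tensor.
            gD[(t.row0 + r) * t.g_ld + (t.col0 + c)] = sS[r * t.tile_cols + c];
        }
    }
}
```

With a 2x2 tile stored at origin (1, 1) of a 4x4 gD, reading A fills gD[0..3], while reading B fills gD[5], gD[6], gD[9], and gD[10]. The two readings coincide only when the destination tile is contiguous in global memory and has the same layout as sS.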