In mma.sp, what is Ti of metadata?

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#matrix-fragments-for-sparse-mma-m16n8k16-with-f16-and-bf16-types

Ti refers to the thread. A tensor core op is a warp-wide operation. Each thread in the warp holds part of the input (and output) for the op. The metadata is likewise held in one register per thread in the warp. The table you have excerpted shows, for each thread, which area its metadata applies to.

In figure 83, we see that a sparse matrix suitable for this kind of sparse matrix-matrix multiply has a particular sparsity pattern. You cannot have an arbitrary sparsity pattern. Instead, for the f16 and bf16 types, considering each “chunk” of 4 adjacent elements in a row, at most 2 of those 4 elements are allowed to be significant, and the other two must be zero.

The metadata selects which positions within each chunk hold the stored (potentially non-zero) data, using a 2-bit index per stored element. An example of the relationship between chunk arrangement and metadata is given in figure 84.
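To make the chunk/metadata relationship concrete, here is a minimal host-side sketch in Python. It is not how the hardware or any NVIDIA library does it: the helper names `compress_2_of_4` and `pack_indices` are hypothetical, and the packing order (first index in the lowest bits) is an assumption for illustration only. It just shows the idea of keeping 2 values per 4-element chunk and recording a 2-bit position for each kept value.

```python
def compress_2_of_4(row):
    """Compress a row with 2:4 sparsity into kept values and their positions.

    Hypothetical helper: returns (values, indices), where each 4-element
    chunk contributes exactly two values and two positions in range 0..3.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    values, indices = [], []
    for start in range(0, len(row), 4):
        chunk = row[start:start + 4]
        keep = [i for i, v in enumerate(chunk) if v != 0]
        if len(keep) > 2:
            raise ValueError("chunk has more than 2 non-zero elements")
        # If fewer than 2 elements are non-zero, pad with zero-valued
        # positions so that every chunk still stores exactly two entries.
        for i in range(4):
            if len(keep) == 2:
                break
            if i not in keep:
                keep.append(i)
        keep.sort()
        values.extend(chunk[i] for i in keep)
        indices.extend(keep)
    return values, indices


def pack_indices(indices):
    """Pack 2-bit positions into one integer, first index in the lowest bits.

    The bit order here is an assumption made for this sketch.
    """
    word = 0
    for n, idx in enumerate(indices):
        word |= idx << (2 * n)
    return word


row = [0, 5, 0, 7,  1, 0, 0, 2,  0, 0, 3, 4,  9, 8, 0, 0]
values, indices = compress_2_of_4(row)
metadata = pack_indices(indices)
print(values)         # [5, 7, 1, 2, 3, 4, 9, 8]
print(indices)        # [1, 3, 0, 3, 2, 3, 0, 1]
print(hex(metadata))  # 0x4ecd
```

In this sketch, a 16-element row compresses to 8 stored values plus 16 bits of position information. In the real instruction, those position bits are distributed across the metadata registers of the threads in the warp, which is what the Ti notation in the table describes.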

I see… But why is Ti used here specifically? Elsewhere the documentation uses T0, T1…

Also I see a T_2i. What does this mean?