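I see kernels in cuBLAS with names like: ampere_ldg8_:
I understand that ldg means load from global memory, but what does the 8 mean? Does it load 8 bits at a time?
In CUTLASS I see similar names, like: cutlass_gemm_align8
I have read the documentation on align, but it is still not very clear to me.
Do the two numbers mean the same thing?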