While trying to build a customized kernel in CUTLASS, I noticed that only certain shape options seem to be available for the block, warp, and thread levels, rather than arbitrary shape configurations.
I wonder whether it is possible to change the code so that it works using SIMT with a shape where m = 1, or some other similarly small m. I have read through the code and still do not understand which part (computation? data movement?) at which level (warp? thread?) exactly prevents me from doing so.