Why should the outer matrix dimensions be equal in matrixMul?

Hello,

I have been wondering why matrixMul (from the CUDA samples) has the condition that the outer matrix dimensions must be equal. Is it an optimization? From what I understand, the matrices are stored in row-major order. Is that the reason, or is it something else?

Thank you for your help,
Best regards

I believe the reason is that it is just a simple example of matrix multiplication using tiling. It is not robust enough to handle arbitrary matrix shapes, for example sizes that are not a multiple of the tile (block) size.

See https://stackoverflow.com/questions/18815489/cuda-tiled-matrix-matrix-multiplication-with-shared-memory-and-matrix-size-whic for a more robust example that handles matrix sizes that are not a multiple of the block size.
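
To make the dimension question concrete, here is a minimal host-side sketch in plain C++. It is not the sample's kernel and not the code from the linked answer. It assumes row-major storage, and `Matrix` and `tiledMatMul` are hypothetical names used only for illustration. The one dimension that has to match is the shared one (columns of A and rows of B), which is a requirement of matrix multiplication itself rather than an optimization. The other dimensions are free, and clamping each tile with `std::min` is one way to cope with sizes that are not a multiple of the tile size.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major matrix: element (r, c) is stored at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Hypothetical helper: computes C = A * B one tile at a time.
// Only A.cols == B.rows is required; A.rows and B.cols may be anything.
Matrix tiledMatMul(const Matrix& A, const Matrix& B, std::size_t tile = 16) {
    if (A.cols != B.rows) {
        throw std::invalid_argument("A.cols must equal B.rows");
    }
    if (tile == 0) {
        throw std::invalid_argument("tile must be positive");
    }

    Matrix C(A.rows, B.cols);  // zero-initialized accumulator

    for (std::size_t i0 = 0; i0 < A.rows; i0 += tile) {
        for (std::size_t j0 = 0; j0 < B.cols; j0 += tile) {
            for (std::size_t k0 = 0; k0 < A.cols; k0 += tile) {
                // Clamp the tile so a partial last tile stays in bounds.
                const std::size_t iEnd = std::min(i0 + tile, A.rows);
                const std::size_t jEnd = std::min(j0 + tile, B.cols);
                const std::size_t kEnd = std::min(k0 + tile, A.cols);

                for (std::size_t i = i0; i < iEnd; ++i) {
                    for (std::size_t j = j0; j < jEnd; ++j) {
                        float sum = C.at(i, j);  // partial sum from earlier k-tiles
                        for (std::size_t k = k0; k < kEnd; ++k) {
                            sum += A.at(i, k) * B.at(k, j);
                        }
                        C.at(i, j) = sum;
                    }
                }
            }
        }
    }
    return C;
}
```

With this sketch, multiplying a 2x3 matrix by a 3x4 matrix gives a 2x4 result even with a tile size of 2, while multiplying a 2x3 matrix by another 2x3 matrix throws because the shared dimension does not match. In a CUDA kernel the same idea would typically show up as bounds checks when loading tiles into shared memory and when writing the result.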