Hello,
I have been wondering why the matrixMul example (CUDA samples) has the condition that the outer matrix dimensions must be equal. Is it an optimization? From what I understand, matrices are stored in row-major order. Is that the reason, or is it something else?
Thank you for any insight,
Best
I believe the reason is that the sample is just a simple demonstration of tiled matrix multiplication. It is not robust enough to handle arbitrary matrix shapes and sizes.
Check out https://stackoverflow.com/questions/18815489/cuda-tiled-matrix-matrix-multiplication-with-shared-memory-and-matrix-size-whic for a more robust example.
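To give a rough idea of what handling arbitrary sizes involves, here is a minimal sketch in plain C++ that emulates the tiling scheme on the CPU. It is not the sample's kernel and not the code from the link above. The function name `tiledMatMul` and the tile width `TILE = 16` are my own assumptions for illustration. Each tile is clamped to the matrix edge, which plays the same role as the bounds checks a GPU kernel would need when the dimensions are not a multiple of the tile width.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed tile width, chosen only for this illustration.
constexpr std::size_t TILE = 16;

// Computes C (M x N) = A (M x K) * B (K x N), all stored row-major.
// The three outer loops walk over tiles; the std::min calls clamp each
// tile to the matrix edge, so M, K and N need not be multiples of TILE.
std::vector<float> tiledMatMul(const std::vector<float>& A,
                               const std::vector<float>& B,
                               std::size_t M, std::size_t K, std::size_t N) {
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i0 = 0; i0 < M; i0 += TILE) {
        const std::size_t iEnd = std::min(i0 + TILE, M);
        for (std::size_t j0 = 0; j0 < N; j0 += TILE) {
            const std::size_t jEnd = std::min(j0 + TILE, N);
            for (std::size_t k0 = 0; k0 < K; k0 += TILE) {
                const std::size_t kEnd = std::min(k0 + TILE, K);
                // Accumulate the partial product contributed by this tile.
                for (std::size_t i = i0; i < iEnd; ++i) {
                    for (std::size_t k = k0; k < kEnd; ++k) {
                        const float a = A[i * K + k];
                        for (std::size_t j = j0; j < jEnd; ++j) {
                            C[i * N + j] += a * B[k * N + j];
                        }
                    }
                }
            }
        }
    }
    return C;
}
```

For example, calling `tiledMatMul(A, B, 2, 3, 2)` with a 2x3 matrix `A` and a 3x2 matrix `B` returns a 2x2 result. The only shape requirement in this sketch is the usual one for a matrix product: the number of columns of `A` and the number of rows of `B` are the same value `K`.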