I have been wondering why the matrixMul example in the CUDA samples requires that the outer matrix dimensions be equal. Is it an optimization? From what I understand, the matrices are stored in row-major order. Is that the reason, or is it something else?
Thank you for any clarification.