Hey guys,
is it just me or is there a bug in the CUDA SDK Sample ‘Matrix Multiplication (CUBLAS)’ of CUDA v6.0 (CUDA Samples :: CUDA Toolkit Documentation)?
There we want to compute A*B=C, where the dimensions of A, B, and C are defined as multiples of
A: 4x2, B: 4x2, C: 4x2 (see lines 206-211). Obviously the dimensions don’t match at all and I was wondering why this even works.
From my understanding, you have to change these lines to multiples of
A: 4x2, B: 2x4, C: 4x4 (or some other constants, which match the dimension constraints).
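To make the dimension constraint concrete, here is a minimal host-side C++ sketch (no CUDA or cuBLAS needed). The names Shape, shapesMatch and productShape are hypothetical helpers made up for illustration and are not part of the sample. It only encodes the rule that A (m x k) times B (k x n) gives C (m x n).

```cpp
#include <cassert>

// Hypothetical shape record (rows x cols); not taken from the SDK sample.
struct Shape {
    unsigned rows;
    unsigned cols;
};

// True when A * B = C is well-formed:
// inner dimensions agree and C has rows of A and columns of B.
constexpr bool shapesMatch(Shape a, Shape b, Shape c) {
    return a.cols == b.rows && c.rows == a.rows && c.cols == b.cols;
}

// Shape of the product A * B; aborts if the inner dimensions disagree.
inline Shape productShape(Shape a, Shape b) {
    assert(a.cols == b.rows && "inner dimensions must match");
    return Shape{a.rows, b.cols};
}
```

With these helpers, the shapes 4x2, 4x2, 4x2 fail shapesMatch, while 4x2, 2x4, 4x4 pass, and productShape of a 4x2 and a 2x4 matrix is 4x4. Scaling every dimension by a common factor does not change the result of the check.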
In addition, we have to change the last argument of the function cublasSgemm (the leading dimension of C) to matrix_size.uiWC. With these changes, everything works fine (also with other constants).
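To show why the leading dimension of C matters, here is a small CPU reference multiply with explicit leading dimensions. This is only a sketch under an assumption: the matrices are stored row-major on the host, so the leading dimension is the stride between consecutive rows (for a dense matrix, its width). The function gemmRowMajor is a hypothetical name and does not call cuBLAS; cuBLAS itself uses column-major storage, where the leading dimension is the stride between consecutive columns instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference: C = A * B for row-major matrices (assumed layout).
// A is hA x wA, B is wA x wB, C is hA x wB.
// lda, ldb, ldc are the strides (in elements) between consecutive rows.
void gemmRowMajor(const std::vector<float>& A, unsigned lda,
                  const std::vector<float>& B, unsigned ldb,
                  std::vector<float>& C, unsigned ldc,
                  unsigned hA, unsigned wA, unsigned wB) {
    // A row stride smaller than the row width would make rows overlap.
    assert(lda >= wA && ldb >= wB && ldc >= wB);
    assert(A.size() >= static_cast<std::size_t>(hA) * lda);
    assert(B.size() >= static_cast<std::size_t>(wA) * ldb);
    assert(C.size() >= static_cast<std::size_t>(hA) * ldc);

    for (unsigned i = 0; i < hA; ++i) {
        for (unsigned j = 0; j < wB; ++j) {
            float sum = 0.0f;
            for (unsigned k = 0; k < wA; ++k) {
                sum += A[i * lda + k] * B[k * ldb + j];
            }
            C[i * ldc + j] = sum;  // element (i, j) of C
        }
    }
}
```

For A: 4x2, B: 2x4, C: 4x4 stored densely, the call would be gemmRowMajor(A, 2, B, 4, C, 4, 4, 2, 4). The row stride of C has to be 4, the width of C; any smaller value trips the first assertion because rows of C would overwrite each other.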
What do you think? Is there an error in my reasoning? Sorry if this topic has already been discussed elsewhere. I didn’t find any mention of this problem on the web.