I am pretty new to CUBLAS programming and I was wondering if there is a way to calculate the transpose of the result of a matrix multiplication “in place”.
Since I have a C++ program working with row-major matrices, I was wondering if it is possible to hide the fact that CUBLAS works with column-major format.
If it were possible to calculate trans(trans(A) * trans(B)) in a single call, then that would be quite easy.
Of course, I could calculate trans(A) * trans(B) with cublasSgemm and then transpose the result in a separate step, but I guess that would be quite slow.
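To make clearer what I am after, here is a small CPU-only sketch of the idea. It relies on the identity trans(A * B) = trans(B) * trans(A), and on the fact that a row-major matrix read as column-major is its own transpose. The function gemm_colmajor is a hypothetical stand-in I wrote for a column-major GEMM (it is not the CUBLAS API), and matmul_rowmajor is likewise just an illustrative name. Swapping the two operands seems to give the row-major product without any explicit transpose:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a column-major GEMM (NOT the CUBLAS API).
// Computes C (m x n) = A (m x k) * B (k x n), all stored column-major,
// with leading dimensions lda, ldb, ldc.
void gemm_colmajor(int m, int n, int k,
                   const float* A, int lda,
                   const float* B, int ldb,
                   float* C, int ldc) {
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * B[p + j * ldb];
            }
            C[i + j * ldc] = acc;
        }
    }
}

// Row-major C (m x n) = A (m x k) * B (k x n), computed with the
// column-major routine above and no explicit transpose.
// Seen as column-major, the buffers hold trans(A), trans(B), trans(C),
// so we ask for trans(C) (n x m) = trans(B) (n x k) * trans(A) (k x m)
// simply by passing B first and A second.
std::vector<float> matmul_rowmajor(int m, int n, int k,
                                   const std::vector<float>& A,
                                   const std::vector<float>& B) {
    assert(A.size() == static_cast<std::size_t>(m) * k);
    assert(B.size() == static_cast<std::size_t>(k) * n);
    std::vector<float> C(static_cast<std::size_t>(m) * n);
    gemm_colmajor(n, m, k,
                  B.data(), n,   // trans(B): n x k, leading dimension n
                  A.data(), k,   // trans(A): k x m, leading dimension k
                  C.data(), n);  // trans(C): n x m, leading dimension n
    return C;
}
```

My assumption is that the same operand swap would carry over to cublasSgemm (passing B before A, with m and n exchanged and no transpose flags), but I have not tested that on the GPU.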
Is there a way to do the above?
Thanks in advance for any help.