Bug in cuBlasLt?

I’m very interested in cuBlasLt because it is supposed to support strided matrices stored in row-major order.

I’ve found the example provided by M. Nicely at https://github.com/mnicely/cublasLt_examples/tree/master/cublasLt_sgemm.

The example works nicely with square matrices in row-major order, but it fails when the matrices are non-square.

In the example, changing:

calculate( i, i, i );

to:

calculate( i, i, i+1 );

leads to the error:

CUDA error at cublasLt_sgemm.cu:172 code=7(CUBLAS_STATUS_INVALID_VALUE)
"cublasLtMatmulAlgoGetHeuristic( ltHandle, operationDesc, Adesc, Bdesc, Cdesc, Cdesc, preference, 1, &heuristicResult, &returnedResults )"

I’ve read the example code and I think it’s perfectly fine.

I’ve also run the same test with the matrices in column-major order, and in that case the modified (non-square) test works.

Is it a cuBlasLt bug?

This is a cublasLt bug and will be fixed in the next CUDA release.
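
Until then, one possible workaround (a sketch only, not something tested against cublasLt in this thread) relies on the column-major path, which was reported above to work for non-square sizes. A row-major m x n buffer has the same memory layout as a column-major n x m buffer holding the transpose, so row-major C = A * B can be obtained from a column-major GEMM by swapping the operands and computing C^T = B^T * A^T. In the plain C++ below, gemm_col_major and gemm_row_major are hypothetical names, and the triple loop is only a CPU stand-in for a real column-major GEMM call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference column-major GEMM: C (m x n) = A (m x k) * B (k x n).
// CPU stand-in for a column-major library call (assumption: the real call
// would take the same sizes and leading dimensions lda/ldb/ldc).
void gemm_col_major(int m, int n, int k,
                    const float* A, int lda,
                    const float* B, int ldb,
                    float* C, int ldc) {
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * B[p + j * ldb];
            }
            C[i + j * ldc] = acc;
        }
    }
}

// Row-major C (m x n) = A (m x k) * B (k x n), computed with the
// column-major routine only. The row-major buffers are reinterpreted as
// column-major transposes, so the operands are swapped: C^T = B^T * A^T.
std::vector<float> gemm_row_major(int m, int n, int k,
                                  const std::vector<float>& A,
                                  const std::vector<float>& B) {
    assert(A.size() == static_cast<std::size_t>(m) * k);
    assert(B.size() == static_cast<std::size_t>(k) * n);
    std::vector<float> C(static_cast<std::size_t>(m) * n);
    // B^T is n x k with leading dimension n; A^T is k x m with leading
    // dimension k; the result C^T is n x m with leading dimension n.
    gemm_col_major(n, m, k, B.data(), n, A.data(), k, C.data(), n);
    return C;
}
```

With a real library the same argument swap would apply: pass B first, A second, exchange m and n, and use the row lengths (n, k, n) as the leading dimensions, without setting any row-major order attribute.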

Thanks for letting me know.