cublas<t>gemv() documentation

The netlib documentation for the SGEMV(TRANS,M,N,ALPHA,A,LDA,X,INCX,BETA,Y,INCY) routine says that:

*  M      - INTEGER.
*           On entry, M specifies the number of rows of the matrix A.
*           M must be at least zero.
*           Unchanged on exit.

*  N      - INTEGER.
*           On entry, N specifies the number of columns of the matrix A.
*           N must be at least zero.
*           Unchanged on exit.

*  A      - REAL             array of DIMENSION ( LDA, n ).
*           Before entry, the leading m by n part of the array A must
*           contain the matrix of coefficients.
*           Unchanged on exit.

*  LDA    - INTEGER.
*           On entry, LDA specifies the first dimension of A as declared
*           in the calling (sub) program. LDA must be at least
*           max( 1, m ).
*           Unchanged on exit.

This means that A should be an LDA x N matrix, with LDA >= max(1,M), and that the matrix multiplication involves only the leading (upper) M x N submatrix of A.
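To make the role of LDA concrete, here is a minimal C sketch of the no-transpose case with unit strides (INCX = INCY = 1). It is not the netlib or cuBLAS code: the names `idx` and `gemv_n_sketch` and the -1 error return are made up for illustration. It only shows that, in column-major storage, element A(i,j) sits at offset i + j*LDA, so LDA has to cover the M rows and is unrelated to N.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major addressing: A(i,j) is stored at a[i + j*lda]. */
static size_t idx(int i, int j, int lda)
{
    return (size_t)i + (size_t)j * (size_t)lda;
}

/*
 * Hypothetical helper (not a netlib or cuBLAS routine):
 * computes y = alpha*A*x + beta*y for an m x n matrix A, no transpose,
 * unit strides. Returns 0 on success and -1 if m, n or lda violate the
 * constraints quoted from the reference documentation.
 */
int gemv_n_sketch(int m, int n, float alpha, const float *a, int lda,
                  const float *x, float beta, float *y)
{
    int min_lda = (m > 1) ? m : 1;      /* max(1, m) */

    if (m < 0 || n < 0 || lda < min_lda)
        return -1;

    /* Debug-only sanity check on the pointers. */
    assert(a != NULL && x != NULL && y != NULL);

    for (int i = 0; i < m; ++i) {
        float acc = 0.0f;
        /* Only rows 0..m-1 of each column are read; rows m..lda-1 are padding. */
        for (int j = 0; j < n; ++j)
            acc += a[idx(i, j, lda)] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
    return 0;
}
```

With m = 2 and n = 3, both lda = 2 (tightly packed) and lda = 4 (two padding rows per column) are accepted by this sketch, while lda = 1 is rejected. A constraint of lda >= max(1,n) would wrongly exclude the valid lda = 2 case.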

The CUDA documentation of cublas<t>gemv() says:

m number of rows of matrix A.

n number of columns of matrix A.

A <type> array of dimension lda x n with lda >= max(1,n) if transa==CUBLAS_OP_N and lda x m with lda >= max(1,n) otherwise.

Should the last statement read as

A <type> array of dimension lda x n with lda >= max(1,m) if transa==CUBLAS_OP_N and lda x m with lda >= max(1,n) otherwise.

(that is, the first inequality should be lda >= max(1,m) instead of lda >= max(1,n))?

Thanks.

I think this is a typo. It would be great to get confirmation of whether it is, and a correction to the documentation if so.

I checked with the CUBLAS team. The CUBLAS documentation is in error, and the CUBLAS implementation matches the reference.

Thank you very much as always for your help, Njuffa :-)