cusolverDn<t>getrf() bufferSize issue

I noticed that cusolverDnSgetrf_bufferSize, cusolverDnDgetrf_bufferSize, cusolverDnCgetrf_bufferSize, and cusolverDnZgetrf_bufferSize all take the matrix being factored as an argument. So, for example:

cusolverStatus_t
cusolverDnZgetrf_bufferSize(cusolverDnHandle_t handle,
                            int m,
                            int n,
                            cuDoubleComplex *A,
                            int lda,
                            int *Lwork);

The pointer to the complex matrix A is one of the arguments. This means that I can only know the size of the needed workspace after I have allocated the complex matrix A!

What I want to do is determine ahead of time whether I have enough GPU memory to hold both the complex matrix A and the workspace.

Am I misunderstanding the use of cusolverDnZgetrf? How can I do what I want to do?
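To make the chicken-and-egg problem concrete, here is a rough sketch of the pre-check I would like to be able to write. cudaMemGetInfo and cuDoubleComplex are the real CUDA calls/types; will_fit is just a name I made up for illustration, and the commented-out workspace term is exactly the part I cannot compute:

#include <cuda_runtime.h>
#include <cuComplex.h>
#include <stdbool.h>

// Hypothetical pre-check: will A, plus the getrf workspace, fit in
// free device memory? The workspace term is the part I cannot
// compute without allocating A first.
bool will_fit(int m, int n)
{
    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);  // query free/total GPU memory

    size_t bytesForA = (size_t)m * (size_t)n * sizeof(cuDoubleComplex);
    // size_t bytesForWork = ?;  // unknown until A exists -- the problem
    return bytesForA /* + bytesForWork */ <= freeBytes;
}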

The amount of GPU memory needed to hold the A matrix itself is just m * n * sizeof(cuDoubleComplex).

The _bufferSize function calculates the workspace size: extra space, beyond what is allocated for the A matrix itself, that the factorization needs to perform its work.

The size of this space may depend on the structure of the A matrix, and so you must provide the complete A matrix before the buffer size can be calculated.
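For reference, a minimal sketch of the usual call sequence (error checking omitted; lu_factor is just an illustrative wrapper name). Note that A must already live on the device when the workspace query runs:

#include <cuda_runtime.h>
#include <cuComplex.h>
#include <cusolverDn.h>

// d_A must already be allocated and populated on the device
// before the workspace size can be queried.
void lu_factor(cusolverDnHandle_t handle, cuDoubleComplex *d_A,
               int m, int n, int lda, int *d_ipiv, int *d_info)
{
    int lwork = 0;
    cusolverDnZgetrf_bufferSize(handle, m, n, d_A, lda, &lwork);

    cuDoubleComplex *d_work = NULL;
    cudaMalloc((void **)&d_work, sizeof(cuDoubleComplex) * lwork);

    // LU factorization with partial pivoting, in place in d_A
    cusolverDnZgetrf(handle, m, n, d_A, lda, d_work, d_ipiv, d_info);

    cudaFree(d_work);
}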

Thanks for the reply, Robert.

Your answer makes sense for the sparse case, but since the matrix A is dense, its structure is already known. In other words, cusolverDnZgetrf_bufferSize should know the structure of A without me having to pass it the actual matrix A.

By “the structure of A” I am referring to characteristics of A which may affect the algorithm choices/path. Not the sparsity pattern.

Anyway, that is the behavior of the API. There is no alternate method provided. If you’d like to see a change in any CUDA behavior, you should file a bug.

Thanks, Robert, I submitted a bug report.