CUSPARSE_SPGEMM_ALG3 in CUDA Fortran with CUDA 12.2

Hi, everyone!
I am trying to compute a sparse matrix-sparse matrix product with cusparseSpGEMM in CUDA Fortran, using CUDA 12.2 and HPC SDK 23.9.
However, I cannot compile calls to cusparseSpGEMM_estimateMemory and cusparseSpGEMM_getNumProducts, which are needed for CUSPARSE_SPGEMM_ALG3, introduced in CUDA 12.
I checked the cuSPARSE sources and found that “cusparseSpGEMM_estimateMemory” and “cusparseSpGEMM_getNumProducts” are declared in cusparse.h but not in cusparse.f90.
I suspect this is why they cannot be compiled from CUDA Fortran.

In my environment, the two files are located as follows:



I would appreciate any advice.

I originally asked this question in the NVIDIA Developer Forum, but a moderator told me to ask it in the HPC compilers forum, so I am posting it here.

Hi, yes, it has come to our attention that some new cuSPARSE features in 12.x are not supported in our Fortran interface module. We will address this in our next release.


Thank you, @bleback.
I am relieved that the problem is not in how we compile.
I look forward to the next release.
In the meantime, we will try to write the interface code ourselves.


These two interfaces should be pretty easy to write since it sounds like you have found the source code for the module. Here is my attempt (they compile, but I haven’t actually used them yet).

! cusparseSpGEMM_getNumProducts
interface cusparseSpGEMM_getNumProducts
  integer(c_int) function cusparseSpGEMM_getNumProducts(spgemmDescr, &
      num_prods) bind(C, name='cusparseSpGEMM_getNumProducts')
    use iso_c_binding, only: c_int   ! kind of the returned status code
    import cusparseSpGEMMDescr
    type(cusparseSpGEMMDescr), value :: spgemmDescr
    integer(8) :: num_prods          ! output, passed by reference
  end function cusparseSpGEMM_getNumProducts
end interface cusparseSpGEMM_getNumProducts

! cusparseSpGEMM_estimateMemory
interface cusparseSpGEMM_estimateMemory
  integer(c_int) function cusparseSpGEMM_estimateMemory(handle, opA, opB, &
      alpha, matA, matB, beta, matC, computeType, alg, spgemmDescr, &
      chunk_fraction, bufferSize3, buffer3, bufferSize2) &
      bind(C, name='cusparseSpGEMM_estimateMemory')
    use iso_c_binding, only: c_int   ! kind of the returned status code
    import cusparseHandle, cusparseSpMatDescr
    import cusparseSpGEMMDescr
    type(cusparseHandle), value :: handle
!pgi$ ignore_tkr (tkrd) alpha, (tkrd) beta
    real(4) :: alpha, beta
    type(cusparseSpMatDescr), value :: matA, matB, matC
    integer(4), value :: opA, opB, computeType, alg
    type(cusparseSpGEMMDescr), value :: spgemmDescr
    real(4), value :: chunk_fraction
    integer(8) :: bufferSize3        ! output, passed by reference
!pgi$ ignore_tkr buffer3
    integer(4), device :: buffer3
    integer(8) :: bufferSize2        ! output, passed by reference
  end function cusparseSpGEMM_estimateMemory
end interface cusparseSpGEMM_estimateMemory
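
For anyone wondering where such an interface could live, here is a minimal sketch that wraps the first one in a module and calls it. It is untested and makes several assumptions: the module name spgemm_alg3_extras and the helper print_num_products are hypothetical, the status is declared as integer(4) as in the other cusparse module functions, and the cusparse module in use (HPC SDK 23.9) does not already export cusparseSpGEMM_getNumProducts. In the cuSPARSE C sample for ALG3, this call comes after cusparseSpGEMM_workEstimation and before cusparseSpGEMM_estimateMemory.

```fortran
module spgemm_alg3_extras
  use cusparse   ! provides cusparseSpGEMMDescr and CUSPARSE_STATUS_SUCCESS
  implicit none

  interface
     ! Same C binding as in the post above; status returned as integer(4)
     integer(4) function cusparseSpGEMM_getNumProducts(spgemmDescr, num_prods) &
          bind(C, name='cusparseSpGEMM_getNumProducts')
       import :: cusparseSpGEMMDescr
       type(cusparseSpGEMMDescr), value :: spgemmDescr
       integer(8) :: num_prods
     end function cusparseSpGEMM_getNumProducts
  end interface

contains

  ! Hypothetical helper: query and print the number of intermediate products.
  ! Assumes spgemmDescr has already been through cusparseSpGEMM_workEstimation.
  subroutine print_num_products(spgemmDescr)
    type(cusparseSpGEMMDescr), intent(in) :: spgemmDescr
    integer(8) :: num_prods
    integer(4) :: istat

    istat = cusparseSpGEMM_getNumProducts(spgemmDescr, num_prods)
    if (istat /= CUSPARSE_STATUS_SUCCESS) then
       print *, 'cusparseSpGEMM_getNumProducts failed, status = ', istat
    else
       print *, 'intermediate products: ', num_prods
    end if
  end subroutine print_num_products

end module spgemm_alg3_extras
```

The estimateMemory interface could be placed in the same interface block in the same way.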


Hi, @bleback !

Thank you very much for your kind attention.

I changed the declarations of alpha, beta and chunk_fraction from real(4) to real(8) because I am doing the calculations in double precision. Then I got the same result as ALG1!
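
One point that may be worth checking here: in cusparse.h, alpha and beta are `const void*` arguments, while chunk_fraction is a plain C `float` passed by value. Since alpha and beta carry `!pgi$ ignore_tkr (tkrd)` in the interface above, real(8) actual arguments should already be accepted for them without editing the declaration. A `real(8), value` chunk_fraction, on the other hand, no longer matches the C type. The small standard-Fortran program below (the program and variable names are only illustrative) shows the size difference and what the first four bytes of a double look like when read as a float. What the library actually receives in that case depends on the platform's calling convention, so treat this as an illustration rather than a diagnosis.

```fortran
program chunk_fraction_kind
  use iso_c_binding, only: c_float, c_double
  implicit none
  real(c_double) :: alpha            ! fine as double: C only sees a const void*
  real(c_float)  :: chunk_fraction   ! matches "float chunk_fraction" in cusparse.h
  real(c_double) :: chunk_as_double  ! what a real(8), value dummy would pass

  alpha = 1.0_c_double
  chunk_fraction = 0.2_c_float
  chunk_as_double = 0.2_c_double

  print '(a, i0, a)', 'alpha (by reference):      ', storage_size(alpha), ' bits'
  print '(a, i0, a)', 'chunk_fraction as float:   ', storage_size(chunk_fraction), ' bits'
  print '(a, i0, a)', 'chunk_fraction as real(8): ', storage_size(chunk_as_double), ' bits'

  ! Reinterpret the leading 4 bytes of the double as a float. This is not
  ! a numeric conversion, so the printed value is unrelated to 0.2.
  print *, 'leading 4 bytes of 0.2d0 read as a float: ', &
       transfer(chunk_as_double, chunk_fraction)
end program chunk_fraction_kind
```

Keeping `real(4), value :: chunk_fraction` in the interface and passing a single-precision value such as 0.2 avoids the mismatch.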

My question is: what does chunk_fraction mean mathematically?
Also, from a programming point of view, can I set it to any value in the range (0, 1]?
Can I save memory by using smaller values?

Also, when I compute the matrix product from the GitHub sample code, ALG1 gives buffersize1=1171 and buffersize2=5687.

In contrast, ALG3 with chunk_fraction=0.2d0 or 1d0 gives
buffersize1=1171, buffersize3=37911, buffersize2=43358.
With chunk_fraction=0.01, it gives
buffersize1=1171, buffersize3=48533271, buffersize2=48538718.
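
To make the comparison easier to read, here is a small standalone sketch that only does arithmetic on the numbers quoted above (the program and variable names are made up for illustration). It prints, for each ALG3 run, the difference between buffersize2 and buffersize3 and the ratio of the ALG3 buffersize2 to the ALG1 buffersize2.

```fortran
program spgemm_buffer_sizes
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = kind(1.0d0)
  ! Buffer sizes copied from the post above
  integer(i8), parameter :: alg1_buf2 = 5687_i8
  integer(i8), parameter :: alg3_buf3(2) = [37911_i8, 48533271_i8]
  integer(i8), parameter :: alg3_buf2(2) = [43358_i8, 48538718_i8]
  character(len=25), parameter :: label(2) = &
       [character(len=25) :: 'chunk_fraction 0.2 or 1.0', 'chunk_fraction 0.01']
  integer :: i

  do i = 1, 2
     ! Difference between the two ALG3 buffers, and ALG3 relative to ALG1
     print '(a, ": buffersize2 - buffersize3 = ", i0, ", ALG3/ALG1 buffersize2 = ", f0.1)', &
          trim(label(i)), alg3_buf2(i) - alg3_buf3(i), &
          real(alg3_buf2(i), dp) / real(alg1_buf2, dp)
  end do
end program spgemm_buffer_sizes
```

In both ALG3 runs the difference buffersize2 - buffersize3 is 5447, so in these numbers the whole change between the two runs sits in the part shared with buffersize3.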

Is ALG3 supposed to result in smaller buffer sizes?
Also, is it possible to choose chunk_fraction so that ALG3 uses less memory than ALG1?


Hi, everyone.

Does anyone know anything about this?
I am really stuck.


This and your subsequent questions are now probably more appropriate for the libraries forum you were previously on.

That’s true.
The compilation problem was solved in this forum.
Thank you very much.
I will ask in the library forum.