cuSPARSE generic procedure could not be resolved NVFORTRAN-S-0155

yunus.altintop.2 · November 8, 2021, 9:58am

Hello,

I have a symmetric matrix that I store just the upper part of it in the csr format. Since the symmetric matrices in the cuSPARSE library are not usually recommended, I convert the upper triangular stored csr matrix to a general csr format in the first part of the code. Then I multiplied it with a matrix in the shape of N=3, P=1 in this example.

Although the conversion part is not optimal, it worked fine and gave the correct result for the matrix. However, the second part of the code (matrix matrix multiplication) gave the following errors.

NVFORTRAN-S-0155-Could not resolve generic procedure cusparsecreatecsr (csr.f90: 122)
NVFORTRAN-S-0155-Could not resolve generic procedure cusparsespmm_buffersize (csr.f90: 127)
NVFORTRAN-S-0155-Could not resolve generic procedure cusparsespmm (csr.f90: 139)
0 inform, 0 warnings, 3 severes, 0 fatal for sptoden

The code is as follows:

program sptoden

use openacc
use cusparse
implicit none

! CUDA API variables
type(cusparseHandle) :: h
type(cusparseMatDescr) :: descrA

INTEGER(8) :: bsize

type(c_devptr) :: buffer
type(cusparseSpMatDescr) :: matA
type(cusparseDnMatDescr) :: matZ
type(cusparseDnMatDescr) :: matY
type(cusparseDnMatDescr) :: matV

INTEGER :: status 

DOUBLE PRECISION :: alpha, beta

INTEGER, dimension(3) :: nnzPerRowZ
INTEGER :: nna

DOUBLE PRECISION, dimension(5) :: csrValA
INTEGER, dimension(5) :: csrColIndA
INTEGER, dimension(4) :: csrRowPtrA

DOUBLE PRECISION, allocatable :: csrValZ(:)
INTEGER, allocatable :: csrColIndZ(:)
INTEGER, allocatable :: csrRowPtrZ(:)
DOUBLE PRECISION, dimension(3,3) :: Z

DOUBLE PRECISION, dimension(3) :: Y, V

INTEGER :: M,I

INTEGER :: N,P

N = 3
P = 1

Y = (/ 1.0d0, 2.0d0, 3.0d0 /)

csrValA = (/ 1.0d0, 3.0d0, -2.0d0, 5.0d0, -1.0d0 /)
csrRowPtrA = (/ 1, 4, 5, 6 /)
csrColIndA = (/ 1, 2, 3, 2, 3 /)


!  The expansion of the sparse matrix A (It is supposed to be upper triangular symmetric.)
!  1.0   3.0  -2.0
!  0.0   5.0   0.0
!  0.0   0.0  -1.0
!  ---------------------------------------------------------------------------------------

Z = 0.0d0

! Transformation from symmetric to general matrix----------------------------------------- 
status = cusparseCreate(h)
if (status /= CUSPARSE_STATUS_SUCCESS) &
  write(*,*) 'cusparseCreate error for transformation: ', status
status = cusparseCreateMatDescr(descrA)
status = cusparseSetMatType(descrA, CUSPARSE_MATRIX_TYPE_GENERAL)
status = cusparseSetMatIndexBase(descrA, CUSPARSE_INDEX_BASE_ONE)
status = cusparseSetStream(h, acc_get_cuda_stream(acc_async_sync))	

!$ACC data copyin(csrValA, csrRowPtrA, csrColIndA) copy(Z)
!$acc host_data use_device(csrValA, csrRowPtrA, csrColIndA, Z)
status = cusparseDcsr2dense(h, 3, 3, descrA, csrValA, csrRowPtrA, csrColIndA, Z, 3)
!$acc end host_data
!$acc end data
! ----------------------------------------------------------------------------------------

! Conversion from non-symmetrical to symmetrical------------------------------------------ 
DO M = 2,3
  DO I = 1,(M-1)
    Z(M,I) = Z(I,M)
  END DO
END DO
! ----------------------------------------------------------------------------------------

!$acc data copyin(Z) create(nnzPerRowZ) copyout(nnzPerRowZ)
!$acc host_data use_device(Z, nnzPerRowZ)
status = cusparseDnnz_v2(h, CUSPARSE_DIRECTION_ROW, &
	          3, 3, descrA, Z, 3, nnzPerRowZ, nna) 
!$acc end host_data
!$acc end data

ALLOCATE(csrValZ(nna))
ALLOCATE(csrRowPtrZ(4))
ALLOCATE(csrColIndZ(nna))

!$acc data copy(Z, nnzPerRowZ) create(csrValZ, csrRowPtrZ, csrColIndZ) copyout(csrValZ, csrRowPtrZ, csrColIndZ)
!$acc host_data use_device(Z, nnzPerRowZ, csrValZ, csrRowPtrZ, csrColIndZ)
status = cusparseDdense2csr(h, 3, 3, descrA, Z, 3, &
	    nnzPerRowZ, csrValZ, csrRowPtrZ, csrColIndZ)
!$acc end host_data
!$acc end data

write(*,*) csrValZ
write(*,*) csrRowPtrZ
write(*,*) csrColIndZ

DO M = 1,3
  DO N = 1,3
    write(*,*) Z(M,N)
  END DO
END DO 

! 2nd part of the code-----------------Matrix-Matrix Multiplication

!$acc data copyin(csrValZ, csrRowPtrZ, csrColIndZ) copy(Y,V)
!$acc host_data use_device(csrValZ, csrRowPtrZ, csrColIndZ,Y,V)
  
status = cusparseCreateDnMat(matY, N, P, N, Y, CUDA_R_64F, CUSPARSE_ORDER_COL)
  
status = cusparseCreateDnMat(matV, N, P, N, V, CUDA_R_64F, CUSPARSE_ORDER_COL)
  
status = cusparseCreateCsr(matZ, N, N, nna, csrRowPtrZ, csrColIndZ, csrValZ, &
                   CUSPARSE_INDEX_32I, CUSPARSE_INDEX_32I, &
                   CUSPARSE_INDEX_BASE_ONE, CUDA_R_64F)
  
status = cusparseSpMM_buffersize(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
                 alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, bsize)

IF (bsize .GT. 0) buffer = acc_malloc(bsize)


status = cusparseSpMM(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
                 alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, buffer) 

  
IF (bsize.gt.0) CALL acc_free(buffer)

!$acc end host_data
!$acc end data

write(*,*) V


DEALLOCATE(csrValZ)
DEALLOCATE(csrRowPtrZ)
DEALLOCATE(csrColIndZ)

end program sptoden

The Makefile is as follows:

FC=nvfortran
TIMER=/usr/bin/time
OPT=
NOPT=-fast -Minfo=opt $(OPT)
FCFLAGS = -Mpreprocess -fast -acc=gpu -cuda -cudalib=cusparse

csr: csr.o
$(TIMER) ./csr.o $(STEPS)
csr.o: csr.f90
$(FC) $(FCFLAGS) -o $@ $< $(NOPT) -Minfo=accel -acc

clean:
rm -f *.o *.exe *.s *.mod a.out

I could not figure out why it does not work.
In fact, the transformation from symmetric sparse to general sparse is pointless in the problem. I just explained it what I wanted to do. I suppose that the problem is related to the matrix matrix multiplication.
Thanks

Yunus

MatColgrove · November 8, 2021, 4:58pm

Hi Yunus,

Check that your argument types match the function signature. In particular, “matZ” should be a “cusparseSpMatDescr” type. See: NVIDIA Fortran CUDA Library Interfaces Version 22.7 for ARM, OpenPower, x86

-Mat

yunus.altintop.2 · November 8, 2021, 5:38pm

Now I comment on the cusparseDnMatGetValue(MatV, V) part, it still gave an error:

NVFORTRAN-S-0155-Could not resolve generic procedure cusparsednmatgetvalues (csr.f90: 147)

I reposted the code as follows:

program sptoden

use openacc
use cusparse
implicit none

! CUDA API variables
type(cusparseHandle) :: h
type(cusparseMatDescr) :: descrA

INTEGER(8) :: bsize

type(c_devptr) :: buffer
type(cusparseSpMatDescr) :: matA
type(cusparseSpMatDescr) :: matZ
type(cusparseDnMatDescr) :: matY
type(cusparseDnMatDescr) :: matV

INTEGER :: status 

DOUBLE PRECISION :: alpha, beta

INTEGER, dimension(3) :: nnzPerRowZ
INTEGER :: nna

DOUBLE PRECISION, dimension(5) :: csrValA
INTEGER, dimension(5) :: csrColIndA
INTEGER, dimension(4) :: csrRowPtrA

DOUBLE PRECISION, allocatable :: csrValZ(:)
INTEGER, allocatable :: csrColIndZ(:)
INTEGER, allocatable :: csrRowPtrZ(:)
DOUBLE PRECISION, dimension(3,3) :: Z

DOUBLE PRECISION, dimension(3) :: Y, V

INTEGER :: M,I

INTEGER :: N,P

N = 3
P = 1

Y = (/ 1.0d0, 2.0d0, 3.0d0 /)

csrValA = (/ 1.0d0, 3.0d0, -2.0d0, 5.0d0, -1.0d0 /)
csrRowPtrA = (/ 1, 4, 5, 6 /)
csrColIndA = (/ 1, 2, 3, 2, 3 /)


!  The expansion of the sparse matrix A (It is supposed to be upper triangular symmetric.)
!  | 1.0   3.0  -2.0 |   |1.0| | 1.0|
!  | 3.0   5.0   0.0 | X |2.0|=|13.0|
!  |-2.0   0.0  -1.0 |   |3.0| |-5.0|
!  ---------------------------------------------------------------------------------------

Z = 0.0d0

! Transformation from symmetric to general matrix----------------------------------------- 
status = cusparseCreate(h)
if (status /= CUSPARSE_STATUS_SUCCESS) &
  write(*,*) 'cusparseCreate error for transformation: ', status
status = cusparseCreateMatDescr(descrA)
status = cusparseSetMatType(descrA, CUSPARSE_MATRIX_TYPE_GENERAL)
status = cusparseSetMatIndexBase(descrA, CUSPARSE_INDEX_BASE_ONE)
status = cusparseSetStream(h, acc_get_cuda_stream(acc_async_sync))	

!$ACC data copyin(csrValA, csrRowPtrA, csrColIndA) copy(Z)
!$acc host_data use_device(csrValA, csrRowPtrA, csrColIndA, Z)
status = cusparseDcsr2dense(h, 3, 3, descrA, csrValA, csrRowPtrA, csrColIndA, Z, 3)
!$acc end host_data
!$acc end data
! ----------------------------------------------------------------------------------------

! Conversion from non-symmetrical to symmetrical------------------------------------------ 
DO M = 2,3
  DO I = 1,(M-1)
    Z(M,I) = Z(I,M)
  END DO
END DO
! ----------------------------------------------------------------------------------------

!$acc data copyin(Z) create(nnzPerRowZ) copyout(nnzPerRowZ)
!$acc host_data use_device(Z, nnzPerRowZ)
status = cusparseDnnz_v2(h, CUSPARSE_DIRECTION_ROW, &
	          3, 3, descrA, Z, 3, nnzPerRowZ, nna) 
!$acc end host_data
!$acc end data


ALLOCATE(csrValZ(nna))
ALLOCATE(csrRowPtrZ(4))
ALLOCATE(csrColIndZ(nna))

!$acc data copy(Z, nnzPerRowZ) create(csrValZ, csrRowPtrZ, csrColIndZ) copyout(csrValZ, csrRowPtrZ, csrColIndZ)
!$acc host_data use_device(Z, nnzPerRowZ, csrValZ, csrRowPtrZ, csrColIndZ)
status = cusparseDdense2csr(h, 3, 3, descrA, Z, 3, &
	    nnzPerRowZ, csrValZ, csrRowPtrZ, csrColIndZ)
!$acc end host_data
!$acc end data

write(*,*) csrValZ
write(*,*) csrRowPtrZ
write(*,*) csrColIndZ

DO M = 1,3
  DO N = 1,3
    write(*,*) Z(M,N)
  END DO
END DO 

write(*,*) "value of nnz ", nna

! 2nd part of the code-----------------Matrix-Matrix Multiplication

!$acc data copyin(csrValZ, csrRowPtrZ, csrColIndZ) copy(Y,V)
!$acc host_data use_device(csrValZ, csrRowPtrZ, csrColIndZ,Y,V)
  
status = cusparseCreateDnMat(matY, N, P, N, Y, CUDA_R_64F, CUSPARSE_ORDER_COL)
!IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseCreateDnMat: ",status
  
status = cusparseCreateDnMat(matV, N, P, N, V, CUDA_R_64F, CUSPARSE_ORDER_COL)
!IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseCreateDnMat: ",status
  
status = cusparseCreateCsr(matZ, N, N, nna, csrRowPtrZ, csrColIndZ, csrValZ, &
                   CUSPARSE_INDEX_32I, CUSPARSE_INDEX_32I, &
                   CUSPARSE_INDEX_BASE_ONE, CUDA_R_64F)
!IF (status.ne.CUSPARSE_STATUS_SUCCESS) print *,"cusparseCreateCsr: ",status
  
status = cusparseSpMM_buffersize(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
                 alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, bsize)
!IF (status.ne.CUSPARSE_STATUS_SUCCESS) print *,"cusparseSpMM_buffersize: ",status
  
!print *,"SpMM buffersize required: ",bsize  
IF (bsize .GT. 0) buffer = acc_malloc(bsize)
  
! -------------------------------------------------------------------------------
!status = cusparseSpMM_preprocess(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
!                 alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, buffer)
! -------------------------------------------------------------------------------

status = cusparseSpMM(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
                 alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, buffer) 
!IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseSpMM: ",status	  
  
! -------------------------------------------------------------------------------
status = cusparseDnMatGetValues(matV, V)
IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseDnMatGetValues: ",status
! -------------------------------------------------------------------------------
  
IF (bsize.gt.0) CALL acc_free(buffer)

!$acc end host_data
!$acc end data

write(*,*) V


DEALLOCATE(csrValZ)
DEALLOCATE(csrRowPtrZ)
DEALLOCATE(csrColIndZ)

end program sptoden

I added dnMatGetValues for (matV, V) because the multiplication is supposed to give the results {1.0, 13.0, -5.0}.

||! | 1.0 3.0 -2.0 | |1.0| | 1.0||
||! | 3.0 5.0 0.0 | X |2.0|=|13.0||
||! |-2.0 0.0 -1.0 | |3.0| |-5.0||

But it gave {0,0,0}. So I added the dnmatgetvalues. It does not work.
The link you shared above says to use a c_devptr type instead of V but I do not know how to convert that type into a usable matrix.

Yunus

MatColgrove · November 8, 2021, 7:53pm

I think in this case, you can use “acc_deviceptr(V)” to grab the device C pointer:

 ! -------------------------------------------------------------------------------
status = cusparseDnMatGetValues(matV, acc_deviceptr(V))
IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseDnMatGetValues: ",status
! -------------------------------------------------------------------------------

Though “V” still produces zeros. Let me ask Brent to take a look to see why you may be getting incorrect results.

-Mat

bleback · November 8, 2021, 9:50pm

In cusparseSpMM(), the result is a dense matrix. A dense device matrix. It is easiest just to read the device values from the underlying dense matrix, which in this case is V.

The cusparseDnMatGetValues can return a type(c_devptr) to the device values, which should just be a pointer to V, on the device. You have to go through gyrations though to convert the type(c_devptr) to something easy to work with in Fortran, so just access V.

bleback · November 8, 2021, 9:52pm

IN other words, cusparseDnMatGetValues needs a **ptr argument to return a pointer. That’s why you have to pass it a type(c_devptr).

yunus.altintop.2 · November 8, 2021, 10:17pm

I understood your point but I cannot implement it.

I did following

type(c_devptr), dimension(:), pointer :: vptr
DOUBLE PRECISION, dimension(3), target :: V
vptr => V

then keep copy V using ACC routines and added cusparseDnMatGetValues(matV, vptr).
Still does not work. Can you show how to do it?

bleback · November 8, 2021, 10:37pm

status = cusparseSpMM(h, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_NON_TRANSPOSE, &
alpha, matZ, matY, beta, matV, CUDA_R_64F, CUSPARSE_CSRMM_ALG1, buffer)
IF (status.ne.CUSPARSE_STATUS_SUCCESS) PRINT *,"cusparseSpMM: ",status

IF (bsize.gt.0) CALL acc_free(buffer)

!$acc end host_data
!$acc end data

write(,) V

You’ve already setup V to be the matrix beneath the dense matrix descriptor, so just read from V after the end data. You also have bugs in your code, need to set alpha and beta, and don’t use M and N as loop indices.

yunus.altintop.2 · November 8, 2021, 10:43pm

Thanks, it fixed it.

system · November 22, 2021, 10:43pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
cuSPARSE Library with OpenACC data Directives: cusparseDnVecGetValues not resolvable Legacy PGI Compilers cuda	6	811	October 11, 2021
Some questions for the function "cusparseDcsrsv2_solve" nvc, nvc++ and nvfortran cuda	12	42	August 9, 2024
Problem in basic dense to csr format conversion using CUSPARSE GPU-Accelerated Libraries	3	922	July 28, 2015
CuSPARSE MM Multiplication: Preprocess and SPMM_CSR_ALG3 Error nvc, nvc++ and nvfortran cuda	1	654	August 17, 2021
Problem using cusparseScsric0 function GPU-Accelerated Libraries	22	6907	November 29, 2012
cusparseScsrmv transpose mode is not working CUDA Programming and Performance	17	1469	July 9, 2018
CUSPARSE conversion routines not working... cusparseSnnz and cusparseSdense2csr misbehaving... CUDA Programming and Performance	11	4156	February 28, 2011
Problems in the use of cusparseSpGEMM in CUDA Fortran GPU-Accelerated Libraries cusparse	16	1351	October 25, 2023
cusparseScsrilu02 breaks with large matrices GPU-Accelerated Libraries cublas , cusolver , cusparse	10	1141	June 1, 2022
CuSolver Sparse on Fortran GPU-Accelerated Libraries cusolver	6	1429	April 28, 2024

cuSPARSE generic procedure could not be resolved NVFORTRAN-S-0155

Related topics