Unable to compile and use the cuSPARSE Dgtsv function in CUDA FORTRAN

Hello all,
I am trying to use the cuSPARSE Dgtsv function to accelerate the tridiagonal-solve part of my code, and I am currently trying out the function in the example program attached below.
The compile command is

pgf90 -Mcudalib=cusparse -o test_cuda_cusparse.exe cuda_tdma.cuf

and I get these errors:

PGF90-S-0084-Illegal use of symbol cusparsedgtsv - attempt to CALL a FUNCTION (cuda_tdma.cuf: 45)
PGF90-S-0450-Argument number 4 to cusparsedgtsv: kind mismatch (cuda_tdma.cuf: 45)
PGF90-S-0450-Argument number 5 to cusparsedgtsv: kind mismatch (cuda_tdma.cuf: 45)
PGF90-S-0450-Argument number 6 to cusparsedgtsv: kind mismatch (cuda_tdma.cuf: 45)
PGF90-S-0450-Argument number 7 to cusparsedgtsv: kind mismatch (cuda_tdma.cuf: 45)
0 inform, 0 warnings, 5 severes, 0 fatal for tdma

The whole code is

   !********** cuda_tdma.cuf ************!

PROGRAM TDMA
	
	use iso_C_binding
	use cudafor
	use cusparse
	implicit none
	
	!	 atri - sub-diagonal (means it is the diagonal below the main diagonal)
	!	 btri - the main diagonal
	!	 ctri - sup-diagonal (means it is the diagonal above the main diagonal)
	!	 dtri - right part
	!	 npts - number of equations
 
	integer npts,i,istat
	parameter (npts=3)
	real,target, dimension(npts) :: atri,btri,ctri,dtri
	real, device :: dl(npts), d(npts), du(npts), B(npts)
	type(cusparseHandle) :: handle
		
	!***** TEST VALUES INITIALIZATION *****!
	
	atri(1)=0	
	atri(2)=-1
	atri(3)=-1
	btri(1)=3
	btri(2)=3
	btri(3)=3
	ctri(1)=-1
	ctri(2)=-1
	ctri(3)=0
	dtri(1)=-1
	dtri(2)=7
	dtri(3)=7
	
	!***** memcpy() HtoD *****!
	
	dl  = atri
	d   = btri
	du  = ctri
	B   = dtri
	
	!***** Calling Dgtsv *****!
	
	call cusparseDgtsv(handle,npts,1,dl,d,du,B,1)
		
	!***** memcpy() DtoH *****!

	dtri = B

	!***** Printing solution *****!

	print*,'The solution is: '
	do i=1,npts
		print*,'X(',i,'):',dtri(i)
	end do
	
 
 END PROGRAM TDMA

Can someone sort this out?

Hi SIDARTH N,

From: PGI Fortran CUDA Library Interfaces Version 19.5 for x86 and NVIDIA Processors

here’s the interface for Dgtsv:

integer(4) function cusparseDgtsv(handle, m, n, dl, d, du, B, ldb)
type(cusparseHandle) :: handle
integer(4) :: m, n, ldb
real(8), device :: dl(*), d(*), du(*), B(*)

So your program has two issues: you’re trying to CALL a function, and you’re using “real” where the routine expects “real(8)” data.

To fix:

istat=cusparseDgtsv(handle,npts,1,dl,d,du,B,1)

Then either compile with “-r8” to promote the default real kind to real(8), declare the arrays as “real(8)”, or use the single-precision version “cusparseSgtsv”.
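Putting both fixes together, the call site would look something like this (an untested sketch; note the handle also has to be initialized with cusparseCreate before any cuSPARSE call, which the original listing omits):

```
! declarations: Dgtsv expects double precision on the device
integer :: istat
real(8), device :: dl(npts), d(npts), du(npts), B(npts)

! the handle must be created before any cuSPARSE call
istat = cusparseCreate(handle)

! Dgtsv is a FUNCTION returning a status, not a subroutine;
! the last argument, ldb, is the leading dimension of B and
! must be at least npts
istat = cusparseDgtsv(handle, npts, 1, dl, d, du, B, npts)
if (istat /= CUSPARSE_STATUS_SUCCESS) print *, 'cusparseDgtsv failed: ', istat
```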

Hope this helps,
Mat

Hey Mat,
I made the changes you suggested and the number of errors has gone down, thanks for that. But I am still left with one severe error:

PGI$ pgf90 -Mcudalib=cusparse -r8 -o test_cuda_cusparse.exe cuda_tdma.cuf
PGF90-S-0034-Syntax error at or near identifier cusparsedgtsv (cuda_tdma.cuf: 19)
0 inform, 0 warnings, 1 severes, 0 fatal for tdma

Can you please take a look at it?
Here is the modified code:

   !********** cuda_tdma.cuf ************!

PROGRAM TDMA
	
	use iso_C_binding
	use cudafor
	use cusparse
	implicit none
	
	!	 atri - sub-diagonal (means it is the diagonal below the main diagonal)
	!	 btri - the main diagonal
	!	 ctri - sup-diagonal (means it is the diagonal above the main diagonal)
	!	 dtri - right part
	!	 npts - number of equations
 
	integer npts,i,istat
	parameter (npts=3)
	real(8),dimension(npts) :: atri,btri,ctri,dtri
	integer(4) function cusparseDgtsv(handle, m, n, dl, d, du, B, ldb)
	type(cusparseHandle) :: handle
	integer(4) :: m, n, ldb
	real(8), device, dimension(npts) :: dl,d,du,B
		
	!***** TEST VALUES INITIALIZATION *****!
	
	atri(1)=0	
	atri(2)=-1
	atri(3)=-1
	btri(1)=3
	btri(2)=3
	btri(3)=3
	ctri(1)=-1
	ctri(2)=-1
	ctri(3)=0
	dtri(1)=-1
	dtri(2)=7
	dtri(3)=7
	
	!***** memcpy() HtoD *****!
	
	dl  = atri
	d   = btri
	du  = ctri
	B   = dtri
	m   = npts
	n   = 1
	ldb = 1
	
	!***** Calling Dgtsv *****!
	
	istat = cusparseDgtsv(handle,m,n,dl,d,du,B,ldb)
		
	!***** memcpy() DtoH *****!

	dtri = B

	!***** Printing solution *****!

	print*,'The solution is: '
	do i=1,npts
		print*,'X(',i,'):',dtri(i)
	end do
	
 
 END PROGRAM TDMA

Regards,
Sidarth N

Comment out the line:

integer(4) function cusparseDgtsv(handle, m, n, dl, d, du, B, ldb)
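That line is the interface copied out of the manual; the actual interface for cusparseDgtsv is already provided by “use cusparse”, so nothing needs to be declared locally. The declaration block then reduces to something like this (sketch):

```
integer npts, i, istat
parameter (npts=3)
real(8), dimension(npts) :: atri, btri, ctri, dtri
type(cusparseHandle) :: handle            ! comes from the cusparse module
integer(4) :: m, n, ldb
real(8), device, dimension(npts) :: dl, d, du, B
```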

Hello Mat,
I have made the changes you suggested and it compiles fine now. Thanks for that, but I am observing faulty output. So I checked the status of the operations and saw this in my output:

CREATE cusparseCreate_status:
CUSPARSE_STATUS_SUCCESS
Dgtsv STATUS:
CUSPARSE_STATUS_INVALID_VALUE

I have inserted all the details needed in the code including the expected output:

   !********** cuda_tdma.cuf ************!

PROGRAM TDMA
	
use iso_C_binding
use cudafor
use cusparse
implicit none

!	 atri - sub-diagonal (means it is the diagonal below the main diagonal)
!	 btri - the main diagonal
!	 ctri - sup-diagonal (means it is the diagonal above the main diagonal)
!	 dtri - right part
!	 npts - number of equations
 
integer(4) npts,i,istat
parameter (npts=31)
real(8),dimension(npts) :: atri,btri,ctri,dtri
integer(4)	cusparseCreate_status
type(cusparseHandle) :: handle 
integer(4) :: m, n, ldb
real(8), device, dimension(npts) :: dl,d,du
real(8), device, dimension(1,npts) :: B
!***** TEST VALUES INITIALIZATION *****!
	
atri       = 1.0	
atri(1)    = 0.0

btri       = 2.0
	
ctri       = 1.0
ctri(npts) = 0.0

!***** dtri = transpose of (1,2,3,4,....16,15,14,13,12,....1) *****!	
do i=1,16			
dtri(i)=i
dtri(32-i)=i    
enddo	
	
!***** memcpy() HtoD *****!
	
dl  = atri
d   = btri
du  = ctri
B(1,:)   = dtri(:)
m   = npts
n   = 1
ldb = 1
	
!**** cusparse_create and check ****!
	
cusparseCreate_status = cusparseCreate(handle) 
	
print*,'CREATE cusparseCreate_status: '
if(cusparseCreate_status.eq.CUSPARSE_STATUS_SUCCESS) print*,'CUSPARSE_STATUS_SUCCESS'
if(cusparseCreate_status.eq.CUSPARSE_STATUS_NOT_INITIALIZED) print*,'CUSPARSE_STATUS_NOT_INITIALIZED'
if(cusparseCreate_status.eq.CUSPARSE_STATUS_ALLOC_FAILED) print*,'CUSPARSE_STATUS_ALLOC_FAILED'
if(cusparseCreate_status.eq.CUSPARSE_STATUS_ARCH_MISMATCH) print*,'CUSPARSE_STATUS_ARCH_MISMATCH'
	
!***** Calling Dgtsv and checking the output Status *****!
istat = cusparseDgtsv(handle,m,n,dl,d,du,B,ldb)
	
print*,'Dgtsv STATUS: '
if(istat.eq.CUSPARSE_STATUS_SUCCESS) print*,'CUSPARSE_STATUS_SUCCESS'
if(istat.eq.CUSPARSE_STATUS_NOT_INITIALIZED) print*,'CUSPARSE_STATUS_NOT_INITIALIZED'
if(istat.eq.CUSPARSE_STATUS_ALLOC_FAILED) print*,'CUSPARSE_STATUS_ALLOC_FAILED'
if(istat.eq.CUSPARSE_STATUS_INVALID_VALUE) print*,'CUSPARSE_STATUS_INVALID_VALUE'
if(istat.eq.CUSPARSE_STATUS_ARCH_MISMATCH) print*,'CUSPARSE_STATUS_ARCH_MISMATCH'
if(istat.eq.CUSPARSE_STATUS_EXECUTION_FAILED) print*,'CUSPARSE_STATUS_EXECUTION_FAILED'
if(istat.eq.CUSPARSE_STATUS_INTERNAL_ERROR) print*,'CUSPARSE_STATUS_INTERNAL_ERROR'
		
!***** memcpy() DtoH *****!

dtri(:) = B(1,:)

!***** Printing solution *****!
	
print*,'The solution is: ' 			
!***** Expected Solution = transpose of (0,1,0,2,0,3,...,0,8,0,7,0,6,...,0,1,0) ****!
do i=1,npts
print*,'SOL(',i,'):',dtri(i)
end do
 
 END PROGRAM TDMA

The output prints the same array as “dtri”, i.e. the right-hand side comes back unchanged, which indicates the solve is not actually happening.

Now I searched for this error, and as far as I can tell I have satisfied all the criteria needed to avoid it. The documentation describes the error as:

CUSPARSE_STATUS_INVALID_VALUE = invalid parameters were passed (m < 3, n < 0).

And my M and N are:

m=31 and n=1

Can you please sort this out for me?

Regards,
Sidarth N

Hi Sidarth N

“B” should be:

real(8), device, dimension(npts,1) :: B

The first dimension is the number of points while the second is the number of right-hand sides. Since ldb is the leading dimension of B, it should be set to npts as well, not 1. After fixing this, the code runs correctly.
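In code, the fix would look something like this (sketch):

```
real(8), device, dimension(npts,1) :: B   ! npts rows, one right-hand side

B(:,1) = dtri(:)                          ! HtoD copy of the right-hand side
ldb    = npts                             ! leading dimension of B (must be >= m)
istat  = cusparseDgtsv(handle, m, n, dl, d, du, B, ldb)
dtri(:) = B(:,1)                          ! DtoH copy of the solution
```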

Note, when researching this I see that gtsv has been deprecated in CUDA 10.1 and that you should look at using gtsv2 for future versions of cuSPARSE. Though, we haven’t added the gtsv2 version to our cusparse module file yet, so you may need to use your own interface until then.
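For reference, a hand-written interface for the gtsv2 pair might look roughly like the following. This is an untested sketch based on the C prototypes in the cuSPARSE documentation; in particular, passing the handle by value assumes the module’s cusparseHandle type is C-interoperable, so check that against your module file before relying on it:

```
interface
   ! query the size of the device work buffer Dgtsv2 needs
   integer(c_int) function cusparseDgtsv2_bufferSizeExt( &
         handle, m, n, dl, d, du, B, ldb, bufferSizeInBytes) &
         bind(C, name='cusparseDgtsv2_bufferSizeExt')
      use iso_c_binding
      use cusparse
      type(cusparseHandle), value :: handle   ! assumption: type is C-interoperable
      integer(c_int), value :: m, n, ldb
      real(c_double), device :: dl(*), d(*), du(*), B(ldb,*)
      integer(c_size_t) :: bufferSizeInBytes
   end function
   ! the solver itself, taking the pre-allocated work buffer
   integer(c_int) function cusparseDgtsv2( &
         handle, m, n, dl, d, du, B, ldb, pBuffer) &
         bind(C, name='cusparseDgtsv2')
      use iso_c_binding
      use cusparse
      type(cusparseHandle), value :: handle
      integer(c_int), value :: m, n, ldb
      real(c_double), device :: dl(*), d(*), du(*), B(ldb,*)
      integer(c_signed_char), device :: pBuffer(*)
   end function
end interface
```

The intended flow is: call cusparseDgtsv2_bufferSizeExt first, allocate a device buffer of that many bytes, then pass it as pBuffer to cusparseDgtsv2.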

See:

https://docs.nvidia.com/cuda/archive/10.1/cusparse/index.html#unique_1438740657

-Mat

Hey Mat,
The code works fine now, thanks for that. I will take a look at the documentation for further updates on gtsv2.