Hi, I’m trying to convert a call to the LAPACK routine dtrtri into cusolverDnXtrtri to run the triangular inversion on the GPU. As I learned from an older topic, the “X” version is the only one available from CUDA 11.4 onwards (I’m using CUDA 11.8).
The issue is that I keep getting an error status back from the call. According to the documentation, an info value of i > 0 means that A(i,i) is zero, but I’m sure the diagonal of my matrix has no zeros.
I suspect the problem lies in the datatype argument (I struggled a lot to get it past the compiler).
Here is an example of how I use the function:
use cublas
use cusolverdn
use cudafor
implicit none
type(cusolverDnHandle) :: handle
integer :: n,istat
integer(8) :: dlwork, hlwork
integer(1), allocatable :: hwork(:)           ! must be declared as an array, not a scalar
integer(1), allocatable, device :: dwork(:)   ! device workspace, sized in bytes
real(8), device :: C(n,n)
integer, device :: istat_d
istat = cusolverDnXtrtri_bufferSize(handle, CUBLAS_FILL_MODE_UPPER, CUBLAS_DIAG_NON_UNIT, n, &
        cudaDataType(CUDA_R_64F), C, n, dlwork, hlwork)
allocate(dwork(dlwork), stat=istat)
allocate(hwork(hlwork), stat=istat)
istat = cusolverDnXtrtri(handle, CUBLAS_FILL_MODE_UPPER, CUBLAS_DIAG_NON_UNIT, n, &
        cudaDataType(CUDA_R_64F), C, n, dwork, dlwork, hwork, hlwork, istat_d)
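For context, here is a minimal self-contained sketch of the full call sequence I have in mind. The handle creation, the host copy of the matrix, the n = 4 test size, and the names C_d/info_d are my own additions for illustration, not from the snippet above; the workspace sizes being byte counts (hence the integer(1) buffers) is how I read the docs, and note that info_d is a device scalar that has to be copied back to the host before it can be checked:

program trtri_example
  use cublas
  use cusolverdn
  use cudafor
  implicit none
  integer, parameter :: n = 4
  type(cusolverDnHandle) :: handle
  integer :: istat, info, i
  integer(8) :: dlwork, hlwork
  integer(1), allocatable :: hwork(:)
  integer(1), allocatable, device :: dwork(:)
  real(8) :: C(n,n)
  real(8), device :: C_d(n,n)
  integer, device :: info_d

  ! Build a well-conditioned upper-triangular test matrix on the host
  C = 0d0
  do i = 1, n
     C(i,i) = 2d0
  end do
  C_d = C

  ! The handle must be created before the first cuSOLVER call;
  ! an uninitialized handle is one possible source of error statuses.
  istat = cusolverDnCreate(handle)

  istat = cusolverDnXtrtri_bufferSize(handle, CUBLAS_FILL_MODE_UPPER, &
          CUBLAS_DIAG_NON_UNIT, n, cudaDataType(CUDA_R_64F), C_d, n, dlwork, hlwork)

  allocate(dwork(max(dlwork, 1_8)))   ! guard against a zero-size request
  allocate(hwork(max(hlwork, 1_8)))

  istat = cusolverDnXtrtri(handle, CUBLAS_FILL_MODE_UPPER, CUBLAS_DIAG_NON_UNIT, &
          n, cudaDataType(CUDA_R_64F), C_d, n, dwork, dlwork, hwork, hlwork, info_d)

  istat = cudaDeviceSynchronize()
  info = info_d    ! device-to-host copy of the info flag before checking it
  print *, 'status =', istat, ' info =', info

  istat = cusolverDnDestroy(handle)
end program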