How to store data when using cusparseDgtsvInterleavedBatch()

Hello!
When I call cusparseDgtsvInterleavedBatch(), I store the data the same way as for cusparseDgtsv2StridedBatch(). The program runs, but the results are all wrong, and I can't follow the cuSPARSE library documentation for cusparseDgtsvInterleavedBatch().
Here is my test program:

PROGRAM TDMA
  use openacc
  use cusparse
  implicit none

  integer, parameter :: npts = 2000 
  integer :: cusparseCreate_status
  type(cusparseHandle) :: handle
  integer :: m, batchstride, batchcount
  real(8) :: dl(npts), d(npts), du(npts), x(npts)
  integer :: i
  integer :: istat
  integer(8) :: bufferSizeInBytes
  integer(1), pointer :: buffer(:)
  integer :: algo=0
  real::t1,t2,t
  
  !$acc data create(dl,d,du,x)
  m = 200
  batchcount=10
  !batchstride=200
  dl = 1.0
  d = 2.0
  du = 1.0 
 
  !do i=1, 200, 8000000    
  !  dl(i) = 0.0
  !end do
  
  !do i=200,200,8000000 
  !  du(i) = 0.0
  !end do
  dl(1)=0.0
  du(2000)=0.0
  do i = 1, 2000
    x(i) = i
  end do

  !$acc update device(dl,d,du,x)
  call cpu_time(t1)
  
  cusparseCreate_status = cusparseCreate(handle)  
  print *, 'CREATE cusparseCreate_status: '
  if (cusparseCreate_status == CUSPARSE_STATUS_SUCCESS) then
    print *, 'CUSPARSE_STATUS_SUCCESS'
  elseif (cusparseCreate_status == CUSPARSE_STATUS_NOT_INITIALIZED) then
    print *, 'CUSPARSE_STATUS_NOT_INITIALIZED'
  elseif (cusparseCreate_status == CUSPARSE_STATUS_ALLOC_FAILED) then
    print *, 'CUSPARSE_STATUS_ALLOC_FAILED'
  elseif (cusparseCreate_status == CUSPARSE_STATUS_ARCH_MISMATCH) then
    print *, 'CUSPARSE_STATUS_ARCH_MISMATCH'
  end if

  istat = cusparseDgtsvInterleavedBatch_bufferSizeExt(handle, algo, m, dl, d, du, x, batchcount, bufferSizeInBytes)
  !!!!! istat = cusparseDgtsv2StridedBatch_bufferSizeExt(handle, m, dl, d, du, x, batchcount, batchstride,bufferSizeInBytes)
  allocate(buffer(bufferSizeInBytes))
!$acc data create(buffer)
 call cpu_time(t1)
  !!!!!!  istat = cusparseDgtsv2StridedBatch(handle, m, dl, d, du, x, batchcount, batchstride, buffer)
  istat=cusparseDgtsvInterleavedBatch(handle, algo, m, dl, d, du, x, batchcount, buffer)
!$acc end data
    print *, 'Dgtsv STATUS: '
  if (istat == CUSPARSE_STATUS_SUCCESS) then
    print *, 'CUSPARSE_STATUS_SUCCESS'
  elseif (istat == CUSPARSE_STATUS_NOT_INITIALIZED) then
    print *, 'CUSPARSE_STATUS_NOT_INITIALIZED'
  elseif (istat == CUSPARSE_STATUS_ALLOC_FAILED) then
    print *, 'CUSPARSE_STATUS_ALLOC_FAILED'
  elseif (istat == CUSPARSE_STATUS_INVALID_VALUE) then
    print *, 'CUSPARSE_STATUS_INVALID_VALUE'
  elseif (istat == CUSPARSE_STATUS_ARCH_MISMATCH) then
    print *, 'CUSPARSE_STATUS_ARCH_MISMATCH'
  elseif (istat == CUSPARSE_STATUS_EXECUTION_FAILED) then
    print *, 'CUSPARSE_STATUS_EXECUTION_FAILED'
  elseif (istat == CUSPARSE_STATUS_INTERNAL_ERROR) then
    print *, 'CUSPARSE_STATUS_INTERNAL_ERROR'
  end if
  call cpu_time(t2)
  t=t2-t1
  print*,'The computing time is:', t 
  !$acc update host(dl,d,du,x)
  
  print *, 'The solution is: '
  do i = 1, 2000
    print *, 'SOL1(', i, '):', x(i)
  end do
  
  
  !$acc end data

END PROGRAM TDMA

How should I store the matrix so that the result is correct? Looking forward to your reply. Thank you so much!

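With interleaved storage, in Fortran, the arrays are treated as dl(batchcount,m), d(batchcount,m), etc. The first batchcount elements hold row 1 of every system and the last batchcount elements hold row m of every system, so setting only dl(1) and du(2000) zeroes a single system's entry. When you initialize the dl and du array elements that are “outside” of the matrix, you should do something like:
dl(1:batchcount)=0.0
du(npts-batchcount+1:npts)=0.0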
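To make the layout concrete, here is a small host-only sketch that needs neither a GPU nor cuSPARSE. It declares the diagonals as 2-D arrays with the batch index first, zeroes the out-of-matrix entries with array sections, and checks where each element lands in the flat 1-D view. The program name, the small sizes (m = 4, batchcount = 3) and the flat array are made up for illustration.

```fortran
program interleaved_layout
  implicit none
  ! Hypothetical small sizes so the layout is easy to inspect.
  integer, parameter :: m = 4, batchcount = 3
  real(8) :: dl(batchcount, m), d(batchcount, m), du(batchcount, m)
  real(8) :: flat(batchcount * m)
  integer :: b, j

  ! Same coefficients for every system, as in the test program.
  dl = 1.0d0
  d  = 2.0d0
  du = 1.0d0

  ! Row 1 of every system has no sub-diagonal entry,
  ! row m of every system has no super-diagonal entry.
  dl(:, 1) = 0.0d0
  du(:, m) = 0.0d0

  ! Fortran is column-major, so element (b, j) of the 2-D array
  ! sits at position (j-1)*batchcount + b of the flat 1-D view.
  flat = reshape(dl, [batchcount * m])
  do j = 1, m
    do b = 1, batchcount
      if (flat((j - 1) * batchcount + b) /= dl(b, j)) error stop 'layout mismatch'
    end do
  end do

  ! The zeroed sub-diagonal entries are the first batchcount elements.
  print *, 'dl, first batchcount elements: ', flat(1:batchcount)

  ! The zeroed super-diagonal entries are the last batchcount elements.
  flat = reshape(du, [batchcount * m])
  print *, 'du, last batchcount elements:  ', flat(batchcount * (m - 1) + 1:)
end program interleaved_layout
```

Declaring the arrays as (batchcount, m) in the real program means the zeroing can be written as dl(:,1) and du(:,m), which is equivalent to the 1-D sections above and harder to get wrong.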
A few other things. There could be a race condition between the end of the inner OpenACC data region and the cuSPARSE function, which runs asynchronously with respect to the host. So I would insert a sync point after the call, either an OpenACC wait or a CUDA device synchronize.
Also, it is a feature of nvfortran that, inside a structured OpenACC data region, it passes the device address to the CUDA library functions we have written interfaces for, using the CUDA Fortran device attribute. Proper OpenACC is to wrap the call in a host_data use_device directive that lists the device arguments dl, d, du, x.
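Putting these suggestions together, the call sequence might look roughly like the sketch below. It is untested and makes some assumptions: nvfortran with something like -acc -cudalib=cusparse, the same argument order the original program already uses for both cuSPARSE calls, and buffer added to the use_device list on the assumption that the solver expects a device workspace. The program name is hypothetical.

```fortran
program tdma_interleaved
  use openacc
  use cusparse
  implicit none
  integer, parameter :: m = 200, batchcount = 10
  ! Batch index first: element (b, j) is row j of system b.
  real(8) :: dl(batchcount, m), d(batchcount, m)
  real(8) :: du(batchcount, m), x(batchcount, m)
  type(cusparseHandle) :: handle
  integer :: istat, j
  integer :: algo = 0
  integer(8) :: bufferSizeInBytes
  integer(1), pointer :: buffer(:)

  dl = 1.0d0
  d  = 2.0d0
  du = 1.0d0
  dl(:, 1) = 0.0d0          ! out-of-matrix entries of every system
  du(:, m) = 0.0d0
  do j = 1, m
    x(:, j) = real(j, 8)    ! same right-hand side for every system
  end do

  istat = cusparseCreate(handle)
  if (istat /= CUSPARSE_STATUS_SUCCESS) stop 'cusparseCreate failed'

  !$acc data copyin(dl, d, du) copy(x)

  ! Query the workspace size, passing device addresses explicitly.
  !$acc host_data use_device(dl, d, du, x)
  istat = cusparseDgtsvInterleavedBatch_bufferSizeExt(handle, algo, m, &
            dl, d, du, x, batchcount, bufferSizeInBytes)
  !$acc end host_data
  if (istat /= CUSPARSE_STATUS_SUCCESS) stop 'buffer size query failed'

  allocate(buffer(bufferSizeInBytes))
  !$acc data create(buffer)
  !$acc host_data use_device(dl, d, du, x, buffer)
  istat = cusparseDgtsvInterleavedBatch(handle, algo, m, &
            dl, d, du, x, batchcount, buffer)
  !$acc end host_data
  ! Sync point before the device buffer is released
  ! (a CUDA device synchronize is the other option mentioned above).
  !$acc wait
  !$acc end data

  !$acc end data
  if (istat /= CUSPARSE_STATUS_SUCCESS) stop 'solve failed'

  ! x now holds the solutions, still interleaved: x(b, :) is system b.
  print *, 'First entries of system 1: ', x(1, 1:5)

  deallocate(buffer)
  istat = cusparseDestroy(handle)
end program tdma_interleaved
```

Because every system here has the same coefficients and right-hand side, all rows x(b, :) should come back identical, which gives a quick sanity check on the layout.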