cudaGetLastError() returns 11

Hi PGI engineers,

I have a big, long subroutine, and the latest news is that I managed to make it work: it compiles with no problems.
But when I decided to add some error checking to make sure everything runs correctly, I ran into the following problem:

istat = cudaGetLastError() returns a non-zero value of 11!
I dug into the problem and found on the NVIDIA website that istat = 11 means: "This indicates that one or more of the parameters passed to the API call is not within an acceptable range of values."

I also looked further into what the cause could be. Could it be that I am passing pointers to matrices (6 matrices in total) of 9900 elements each?

Here are the kernel call arguments:

call GetReynVar_kernel<<<grid,threads>>>(nx,ny,ndx,ndy,pDev, &
hnewDev,himinDev,himaxDev,cohimxDev,s,l,kd,zdatLowDev,qndatLowDev, &
zdatMidDev,qndatMidDev,zdatHighDev,qndatHighDev,qniDev)

istat = cudaGetLastError()
if (istat .ne. 0) print*, cudaGetErrorString(istat)


The kernel subroutine:
!===============================================
attributes (global) subroutine GetReynVar_kernel (nx,ny,ndx,ndy, &
p,hnew,himin,himax,cohimx,s,l,kd,zdatLow,qndatLow, &
zdatMid,qndatMid,zdatHigh,qndatHigh,qni)
!===============================================

implicit none
integer :: i, j, k
integer, value :: nx,ny,ndx,ndy,s,l,kd
real(8) :: qni(nx,ny)
real(8) :: zdatLow(s), qndatLow(s) ! s = 9900
real(8) :: zdatMid(l), qndatMid(l) ! l = 9900
real(8) :: zdatHigh(kd), qndatHigh(kd) ! kd = 9900
real(8) :: p(nx,ny)
real(8) :: hnew(ndx,ndy),himin(ndx,ndy),himax(ndx,ndy),cohimx(ndx,ndy)
real(8) :: AIJDev, ALZDev, C11Dev, C12Dev, C22Dev, DELDev, &
EVDev, F1Dev, F2Dev, FODev, H2BARDev, H3BARDev, HBARDev
real(8) :: VDev, V1Dev,V2Dev,V3Dev, V4Dev, V5Dev, XIDev, z0Dev
integer, dimension (400,400) :: locDev
real(8), dimension (400,400) :: QNDev, qn1Dev, qn2Dev,Q2Dev,PNDev, &
qn11Dev, qn22Dev,himDev, pimDev, sliderplaneheightDev,ZDev,z1Dev,z2Dev, &
R1Dev, S1Dev, U1Dev
real(8), dimension(20) :: ADev, BDev

i = (blockidx%x - 1) * blockDim%x + threadidx%x
j = (blockidx%y - 1) * blockDim%y + threadidx%y
hbarDev = 1.d0
h2barDev = 1.d0
h3barDev = 1.d0
if(((i >=2 ) .AND. (i <= nx)) .AND. ((j >=2) .AND. (j <= ny-1))) then

locDev(i,j) = floor((zDev(i,j) - zBaseLowDev) / zStepLowDev) + 1;
z1Dev(i,j)=zdatLow(locDev(i,j));

endif
end subroutine GetReynVar_kernel




please help!

Dolf

Hi Dolf,

I also looked further into what the cause could be. Could it be that I am passing pointers to matrices (6 matrices in total) of 9900 elements each?

I wouldn’t think so. As long as they are device pointers, then the size of the array they point to doesn’t matter.
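To illustrate what "device pointers" means here, the following is a minimal CUDA Fortran sketch, not the original poster's code. It assumes the host-side arrays are declared with the `device` attribute; the program name and the host array `zdatLow` are hypothetical, while `zdatLowDev` and the size 9900 are taken from the post.

```fortran
program device_array_sketch
  use cudafor
  implicit none
  integer, parameter :: s = 9900                 ! array length quoted in the post
  real(8), allocatable         :: zdatLow(:)     ! host copy (hypothetical name)
  real(8), allocatable, device :: zdatLowDev(:)  ! device copy, the one a kernel would receive
  integer :: istat

  allocate(zdatLow(s))
  zdatLow = 0.d0

  ! Allocate on the GPU; the "device" attribute is what makes this a device array.
  allocate(zdatLowDev(s), stat=istat)
  if (istat /= 0) stop 'device allocation failed'

  ! Assignment between a host array and a device array performs the copy.
  zdatLowDev = zdatLow

  deallocate(zdatLowDev, zdatLow)
end program device_array_sketch
```

If an array passed to the kernel is missing the `device` attribute on the host side, that would be worth ruling out before looking at array sizes.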

One thing that just came to me: is “GetReynVar_kernel” in a module, or does it have an explicit interface? Global device routines are required to have an explicit interface, and these types of odd run-time errors can occur if they don’t.
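As a rough sketch of the module approach, assuming a much smaller kernel than the one in the post: placing the kernel in a module and adding a `use` statement in the caller gives it an explicit interface automatically. The names `reyn_kernels`, `scale_kernel`, `launch_sketch`, `a`, and `aDev` are hypothetical and only for illustration.

```fortran
module reyn_kernels
  use cudafor
  implicit none
contains
  ! A kernel contained in a module has an explicit interface wherever the module is used.
  attributes(global) subroutine scale_kernel(n, a)
    integer, value :: n
    real(8) :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = 2.d0 * a(i)
  end subroutine scale_kernel
end module reyn_kernels

program launch_sketch
  use cudafor
  use reyn_kernels
  implicit none
  integer, parameter :: n = 9900
  real(8), device :: aDev(n)
  real(8) :: a(n)
  integer :: istat

  a = 1.d0
  aDev = a
  call scale_kernel<<<(n + 255) / 256, 256>>>(n, aDev)

  ! Errors detected at launch time (for example, a bad launch configuration).
  istat = cudaGetLastError()
  if (istat /= 0) print *, 'launch: ', cudaGetErrorString(istat)

  ! Errors that only surface once the kernel has actually run.
  istat = cudaDeviceSynchronize()
  if (istat /= 0) print *, 'execution: ', cudaGetErrorString(istat)

  a = aDev
  print *, a(1), a(n)
end program launch_sketch
```

Checking once right after the launch and once after synchronizing helps separate a launch-time problem from one that occurs while the kernel executes.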

  • Mat

It is still giving me the same error. Any other ideas?

Sorry, no. Can you send me the code?

  • Mat