When I use the directive “!$acc do kernel” on loops that are supposed to run on the accelerator, the compiler prints no messages telling me whether the code is parallelizable or not.
Example:
!$acc do kernel
DO i = 1, N
   Ab(i,1) = Ab(i,1) + 1/dt
   DO j = 1, maxnklm+maxnklp-1
      compP(j,i) = compPdt(i,j)*dt + compP(j-1,i)
      IF (flag .EQ. 1 .AND. compP(j,i) .GE. X(i)) THEN
         isfu(i) = indexPdt(i,j)
         flag = 0
      ENDIF
   ENDDO
ENDDO
I compiled with these flags:
-ta=nvidia,3.0,cc13 -Minfo=accel -v
However, when I try “!$acc region do kernel” instead, the code compiles successfully. Nevertheless, it fails at run time with this error:
call to cuMemcpyDtoH returned error 700: Launch failed
I have no idea what causes this error, since the data is small and all the arrays are allocated. I hope that
- someone can give me a hint
- the compiler could report which variable causes this error (in debug mode).
Thanks,
Tuan