Hi,
I am trying to prevent my code from crashing due to inadmissible values. I want to check the values of the array within the kernel. If a value is inadmissible, I want to:
- Assign an error code
- Have the kernel exit and return to the host
- Write an appropriate error message then stop the program.
Here is an oversimplified version of the kernel that I want to check. I have used fixed-size arrays here, but in the actual code the arrays are allocated.
Module DeviceCode
  Use cudafor
  Implicit None
  Integer:: ErrorCode_h, istat
  Integer, Device:: ErrorCode        ! error flag that lives on the device
Contains
  Attributes(Global) Subroutine KernelwOBError(Array_k)
    Implicit None
    Integer:: i, idx
    Integer, Intent(INOUT):: Array_k(4)
    i = threadIdx%x
    Array_k(i) = i                   ! thread 5 writes past the end of the array here
    If (Array_k(i) > 4) Then
      ErrorCode = 100
      ! Force Exit of kernel???
    End If
  End Subroutine
End Module
Program EarlyExit
  Use DeviceCode
  Use cudafor
  Implicit None
  Integer:: i
  Integer:: Array_h(4)
  Integer, Device:: Array_d(4)
  Array_h = 0
  Array_d = Array_h
  ErrorCode = 0
  ! 5 threads for a 4-element array, so thread 5 is out of bounds
  Call KernelwOBError<<<1,5>>>(Array_d)
  istat = cudaDeviceSynchronize()
  ErrorCode_h = ErrorCode            ! copy the device flag back to the host
  If (ErrorCode_h .ne. 0) Then
    istat = cudaGetLastError()
    Write(*,'(" ErrorCode = ", I5)') ErrorCode_h
    Write(*,*) "Out Of Bounds Access"
    If (istat .ne. 0) Then
      Write(*,*) "Error in Kernel : ", cudaGetErrorString(istat)
    End If
    Stop
  End If
  Array_h = Array_d
  Do i = 1, 4
    Write(*,'(" Array(",I4,") = ", I5)') i, Array_h(i)
  End Do
  Write(*,*) "Normal Completion"
End Program
This simple example does not actually crash on the device side, but my actual code does. I have tried to keep the sample as simple as possible. Really, all I am looking for is a simple way to force the device kernel to exit early.
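To make the intended behavior concrete, here is a minimal sketch of one possible shape for the early exit. It is not a confirmed solution. It assumes that a plain Return inside an Attributes(Global) subroutine ends only the calling thread, while the other threads keep running, and it moves the check ahead of the write so the bad index is never used. The names EarlyExitSketch, CheckedKernel, ErrorFlag, and the by-value argument n are hypothetical and are not part of the actual code.

```fortran
Module EarlyExitSketch
  Use cudafor
  Implicit None
  Integer, Device :: ErrorFlag            ! hypothetical device-side error flag
Contains
  Attributes(Global) Subroutine CheckedKernel(Array_k, n)
    Implicit None
    Integer, Value :: n                   ! array length, passed by value
    Integer, Intent(InOut) :: Array_k(n)
    Integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    ! Validate before touching memory, so the bad index is never used.
    If (i < 1 .or. i > n) Then
      ErrorFlag = 100                     ! every failing thread stores the same value
      Return                              ! assumption: ends this thread only
    End If
    Array_k(i) = i
  End Subroutine
End Module

Program EarlyExitSketchMain
  Use EarlyExitSketch
  Use cudafor
  Implicit None
  Integer :: istat, ErrorFlag_h
  Integer :: Array_h(4)
  Integer, Device :: Array_d(4)
  Array_d = 0
  ErrorFlag = 0
  ! Same launch as above: 5 threads for 4 elements, so thread 5 fails the check.
  Call CheckedKernel<<<1,5>>>(Array_d, 4)
  istat = cudaDeviceSynchronize()
  ErrorFlag_h = ErrorFlag                 ! copy the flag back to the host
  If (ErrorFlag_h /= 0) Then
    Write(*,'(" ErrorCode = ", I5)') ErrorFlag_h
    Stop
  End If
  Array_h = Array_d
  Write(*,*) Array_h
End Program
```

In this sketch only the offending thread returns early. The remaining threads still finish their work, and the host decides whether to stop after reading the flag. If several different error codes were possible, an atomic update of the flag would probably be safer than a plain store.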
Thank you for any help,
Kirk