Force kernal to terminate

Hi,

I am trying to prevent my code from crashing do to values that are not admissible. I wanted to check the values of the array within the kernel. If it is inadmissible:

  • Assign an error code
  • Have the kernel exit and return to the host
  • Write an appropriate error message then stop the program.

Here is a OVER simplification of the kernel that I want to check. I have used fixed sized arrays here, but in the actual code, the arrays are allocated.

Module DeviceCode

Use cudafor

Implicit None

Integer:: ErrorCode_h, istat
Integer, Device:: ErrorCode

Contains

	Attributes(Global) Subroutine KernelwOBError(Array_k)
	
	Implicit None

	Integer:: i, idx
	Integer, Intent(INOUT):: Array_k(4)

	i = threadIdx%x
	
	Array_k(i) = i
	If (Array_k(i) > 4) Then
		ErrorCode = 100
		! Force Exit of kernel???
	End If

	End Subroutine

End Module

Program EarlyExit

	Use DeviceCode
	Use cudafor

	Implicit None 

	Integer:: i
	Integer:: Array_h(4)
	Integer, Device:: Array_d(4)

	Array_h = 0
	Array_d = Array_h
	ErrorCode = 0

	Call KernelwOBError<<<1,5>>>(Array_d)
	istat = cudaDeviceSynchronize()
	ErrorCode_h = ErrorCode
	If (ErrorCode_h .ne. 0) Then
		istat = cudaGetLastError()
		Write(*,'(" ErrorCode = ", I5)') ErrorCode_h	
		Write(*,*) "Out Of Bounds Access"
		If (istat .ne. 0) Then
			Write(*,*) "Error in Kernel : ", cudaGetErrorString(istat)			
		End If
		Stop
	End If
	Array_h = Array_d
	
	Do i = 1, 4
		Write(*,'(" Array(",I4,") = ", I5)') i, Array_h(i)	
	End Do

	Write(*,*) "Normal Completion"

End Program

In this simple code, the program does not actually crash on the device side. But in my actual code it does. I have tried to keep the sample as simple as possible. Really, all I am looking for is a simple way to force the device kernel to exit early.

Thank you for any help,

Kirk

Hi Kirk,

If you put a STOP statement in your kernel then it will cause a runtime abort and print out which block and thread triggered the abort. The CUDA runtime’s abort call prints out some text as well. However, you wont be able to copy back the error code.

Note that we don’t have STOP working with the new LLVM back-end quite yet, so you’ll need to add the flag “-Mcuda=nollvm”.

% cat cuf_stop.cuf
Module DeviceCode

 Use cudafor

 Implicit None

 Integer:: ErrorCode_h, istat
 Integer, Device:: ErrorCode

 Contains

    Attributes(Global) Subroutine KernelwOBError(Array_k)

    Implicit None

    Integer:: i, idx
    Integer, Intent(INOUT):: Array_k(4)

    i = threadIdx%x

    Array_k(i) = i
    If (Array_k(i) > 4) Then
       STOP 100
    End If

    End Subroutine

 End Module

 Program EarlyExit

    Use DeviceCode
    Use cudafor

    Implicit None

    Integer:: i
    Integer:: Array_h(4)
    Integer, Device:: Array_d(4)

    Array_h = 0
    Array_d = Array_h
    ErrorCode = 0

    Call KernelwOBError<<<1,5>>>(Array_d)
    istat = cudaDeviceSynchronize()
    istat = cudaGetLastError()
    ! ErrorCode_h = ErrorCode
    If (istat .ne. 0) Then
       Write(*,'(" ErrorCode = ", I5)') istat
       Write(*,*) "Out Of Bounds Access"
       If (istat .ne. 0) Then
          Write(*,*) "Error in Kernel : ", cudaGetErrorString(istat)
       End If
       Stop
    End If
    Array_h = Array_d

    Do i = 1, 4
       Write(*,'(" Array(",I4,") = ", I5)') i, Array_h(i)
    End Do

    Write(*,*) "Normal Completion"

 End Program


% pgf90 -Mcuda=nollvm cuf_stop.cuf; a.out
FORTRAN STOP:   100: Block (1,1,1), Thread (5,1,1)
/opt/pgi/linux86-64/15.3/include_acc/pgi_cuda_runtime.h:2686: signed char *pgf90_stop08(signed char *, const signed char *, int): block: [0,0,0], thread: [4,0,0] Assertion `FORTRAN_STOP_STATEMENT` failed.
 ErrorCode =    59
 Out Of Bounds Access
 Error in Kernel :
 device-side assert triggered
Warning: ieee_inexact is signaling
FORTRAN STOP
  • Mat

Will this work in PVF v14.9?

No, sorry. It’s new in 15.1.