Inconsistent runtime error in device-to-host copy

joe.steinberg · June 9, 2010, 5:06pm

Hi,

I have recently run into a strange bug in the program I am working on. The error occurs at a device-to-host array assignment, but it does not happen every time I run the program. The code for the relevant subroutine is below:

subroutine CubicInterpVec3D(coords, result)

		real(real_kind), dimension(:,:) :: coords
		real(real_kind), dimension(:) :: result

		integer :: nCoords, dimGrid, dimBlock
		real(real_kind), device, allocatable, dimension(:,:) :: coordsDev
		real(real_kind), device, allocatable, dimension(:) :: resultDev

		if(allocFlag==1) then

			nCoords = size(result)
			if(size(coords,1) .ne. nCoords) then
				print *, 'Number of coordinates is not equal to the number of desired interpolated values!'
				stop 'Program terminated by cubic_bspline_interp_3D_mod:CubicVec3D'
			endif

			print *, 'Attempting to allocate device memory...'
			allocate( coordsDev(nCoords, 3), resultDev(nCoords) )
			
			print *, 'Attempting to copy test points to device...'			
			coordsDev = coords(1:nCoords, 1:3)

			print *, 'Attempting to call the kernel...'
			dimBlock = 16
			dimGrid = max(1,nCoords/dimBlock+1)
			call CubicInterpVec3D_kernel<<<dimGrid,dimBlock>>>(coordsDev,resultDev,nCoords)

			print *, 'Attempting to copy results back to host...'
			result=resultDev(1:nCoords)
			!istat = cudaMemcpy(result,resultDev,nCoords)

			print *, 'Deallocating device memory...'
			deallocate(coordsDev,resultDev)
		else
			print *, 'Coefficient matrix not allocated on device yet!'
			stop 'Program terminated by cubic_bspline_interp_3D_mod:CubicInterpVec3D'
		endif

end subroutine CubicInterpVec3D

The error occurs at

result=resultDev(1:nCoords)

As you can see from the commented-out line, I have also tried using cudaMemcpy, but the same intermittent error shows up. The error is:

copyout Memcpy (host=0x16edf00, dev=0x1f94b00, size=200) FAILED:4

I am running a 9800GT on 64-bit Ubuntu Linux. Any help would be appreciated. I can post the full code if anyone needs it, but it doesn’t seem relevant, since the error shows up at the very end of the program, after all the kernel calls and everything else have run.

MatColgrove · June 9, 2010, 8:09pm

Hi Joe,

My best guess is that your kernel is failing. Try adding the following code just after the call to your kernel to see if I’m correct.

! Check for errors
    errCode = cudaGetLastError()
    if (errCode .gt. 0) then
       print *, cudaGetErrorString(errCode)
    endif

Mat

joe.steinberg · June 9, 2010, 8:41pm

No errors are reported. Also, I tried running the program in emulation mode, and it segfaults at exactly that point every time (so the inconsistency is not there in emulation mode).
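
One possible reason a check placed right after the launch stays silent: kernel launches are asynchronous, so cudaGetLastError() at that point mostly reports launch-time problems, while a fault during execution tends to surface at the next synchronizing call, such as the copy back to the host. Below is a minimal sketch of checking both stages. It assumes the cudafor module and a toolkit that provides cudaDeviceSynchronize (toolkits from 2010 used cudaThreadSynchronize instead). The module check_demo and the kernel fill_kernel are hypothetical stand-ins, not code from this thread.

```fortran
module check_demo
  use cudafor
  implicit none
contains
  ! Hypothetical stand-in kernel: stores each element's own index.
  attributes(global) subroutine fill_kernel(a, n)
    integer, value :: n
    real :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = real(i)
  end subroutine fill_kernel
end module check_demo

program main
  use cudafor
  use check_demo
  implicit none
  integer, parameter :: n = 50
  real, device, allocatable :: aDev(:)
  real :: a(n)
  integer :: istat

  allocate(aDev(n))
  call fill_kernel<<<(n + 15) / 16, 16>>>(aDev, n)

  ! Stage 1: errors detected at launch (e.g. an invalid configuration)
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, 'launch: ', trim(cudaGetErrorString(istat))

  ! Stage 2: wait for the kernel to finish; execution faults are returned here
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'execution: ', trim(cudaGetErrorString(istat))

  a = aDev
  print *, a(1), a(n)
  deallocate(aDev)
end program main
```

Even with both checks, a stray write that happens to land in valid device memory will not be reported by either call.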

joe.steinberg · June 10, 2010, 4:41pm

In case anyone cares, it turned out to be a stupid typo in the kernel after all, which caused a write outside a device array’s bounds. It didn’t trigger a kernel error, though. It’s odd that this caused a segfault consistently in emulation mode but only once in a while in GPU mode.
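
A common safeguard against this class of bug is to guard the kernel body on the array length, because a launch sized like dimGrid = nCoords/dimBlock+1 starts more threads than there are elements. Below is a sketch of that pattern only. The kernel sum_coords_kernel and module guard_demo are hypothetical, default-kind reals stand in for real_kind, and the code does not reproduce the actual typo, which was never posted.

```fortran
module guard_demo
  use cudafor
  implicit none
contains
  ! Hypothetical kernel: one thread per coordinate triple.
  attributes(global) subroutine sum_coords_kernel(coords, res, n)
    integer, value :: n
    real :: coords(n, 3), res(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    ! Threads whose index is past the end of the arrays do nothing
    if (i <= n) res(i) = coords(i, 1) + coords(i, 2) + coords(i, 3)
  end subroutine sum_coords_kernel
end module guard_demo

program main
  use cudafor
  use guard_demo
  implicit none
  integer, parameter :: n = 50, dimBlock = 16
  integer :: dimGrid
  real :: coords(n, 3), res(n)
  real, device, allocatable :: coordsDev(:,:), resDev(:)

  coords = 1.0
  allocate(coordsDev(n, 3), resDev(n))
  coordsDev = coords

  ! Ceiling division: enough blocks to cover n, with a partial last block
  dimGrid = (n + dimBlock - 1) / dimBlock
  call sum_coords_kernel<<<dimGrid, dimBlock>>>(coordsDev, resDev, n)

  res = resDev   ! array assignment copies the results back to the host
  print *, all(res == 3.0)
  deallocate(coordsDev, resDev)
end program main
```

As for the difference between the two modes, a plausible explanation is that a stray write on the GPU only faults when it lands on an unmapped address and otherwise silently corrupts neighbouring data, whereas in emulation the same write goes through the host's memory protection. Running under a memory checker such as cuda-memcheck (compute-sanitizer in newer toolkits) can flag out-of-bounds device writes that do not crash.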
