GPU module not executing


I am relatively new to CUDA. I can get my code to compile and run, but it does not appear to be executing the module on the GPU. Here is an example taken from “CUDA Fortran for Scientists and Engineers” that shows the problem:

module simpleOps_m
contains
	attributes(global) subroutine increment(a, b)

		implicit none
		integer , intent(inout) :: a(:)
		integer , value :: b
		integer :: i

		i = threadIdx%x
		a(i) = a(i)+b

	end subroutine increment
end module simpleOps_m

program incrementTestGPU
	use cudafor
	use simpleOps_m
	implicit none
	integer , parameter :: n = 256
	integer :: a(n), b
	integer , device :: a_d(n)

	a = 1
	b = 3
	a_d = a

	call increment <<<1,n>>>(a_d , b)

	a = a_d

	write(*,*) "a = ", a

	if (any(a /= 4)) then
		write(*,*) "**** Program Failed ****"
	else
		write(*,*) "Program Passed"
	end if
end program incrementTestGPU

I put the code in a file named test2.f95 and compile it with “pgf90 -Mcuda test2.f95”. However, when I run the program, all the values of a are still 1 and I get “Program Failed”. It appears the part on the GPU is not running. I am using PGI Community Edition on Ubuntu 16.04.

Any idea what could be wrong?



I think it was because I needed -Mcuda=cc60.

Yes - cc60 and cc20 are not added by default. You have to manually add them.
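
For anyone who hits the same symptom: a kernel launch that fails does not stop the program, so the host array simply comes back unchanged. Below is a minimal sketch of the same example with status checks added, assuming device 0 is the GPU in use. The program name incrementCheckGPU and the variables istat and prop are my own additions, not from the book. It prints the compute capability of the card (which suggests the value to pass to -Mcuda=ccXX) and reports any error from the launch.

```fortran
module simpleOps_m
contains
	attributes(global) subroutine increment(a, b)
		implicit none
		integer , intent(inout) :: a(:)
		integer , value :: b
		integer :: i

		i = threadIdx%x
		a(i) = a(i)+b
	end subroutine increment
end module simpleOps_m

program incrementCheckGPU
	use cudafor
	use simpleOps_m
	implicit none
	integer , parameter :: n = 256
	integer :: a(n), b, istat
	integer , device :: a_d(n)
	type(cudaDeviceProp) :: prop

	! Report the compute capability of device 0 (assumed to be the GPU in use)
	! so the matching -Mcuda=ccXX value can be chosen at compile time.
	istat = cudaGetDeviceProperties(prop, 0)
	if (istat /= cudaSuccess) then
		write(*,*) "cudaGetDeviceProperties: ", trim(cudaGetErrorString(istat))
		stop
	end if
	write(*,"(a,i0,a,i0)") "Compute capability: ", prop%major, ".", prop%minor

	a = 1
	b = 3
	a_d = a

	call increment <<<1,n>>>(a_d , b)

	! Errors detected at launch time are picked up here.
	istat = cudaGetLastError()
	if (istat /= cudaSuccess) then
		write(*,*) "Kernel launch failed: ", trim(cudaGetErrorString(istat))
	end if

	! Wait for the kernel to finish and pick up errors raised while it ran.
	istat = cudaDeviceSynchronize()
	if (istat /= cudaSuccess) then
		write(*,*) "Kernel execution failed: ", trim(cudaGetErrorString(istat))
	end if

	a = a_d

	if (any(a /= 4)) then
		write(*,*) "**** Program Failed ****"
	else
		write(*,*) "Program Passed"
	end if
end program incrementCheckGPU
```

If this prints 6.0 for the compute capability, then building with “pgf90 -Mcuda=cc60 test2.f95” matches the fix described above. I have not listed the exact error text you would see without the flag, since that may vary between compiler and driver versions.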