I took an example of data transfer between Host and Device for CUDA Fortran and found this:
    program incTest
      use cudafor
      use simpleOps_m
      implicit none
      integer, parameter :: n = 256
      integer :: a(n), b, i
      integer, device :: a_d(n)

      a = 1
      b = 3
      a_d = a
      call inc<<<1,n>>>(a_d, b)
      a = a_d

      if (all(a == 4)) then
        write(*,*) 'Success'
      endif
    end program incTest

    module simpleOps_m
    contains
      attributes(global) subroutine inc(a, b)
        implicit none
        integer :: a(:)
        integer, value :: b
        integer :: i

        i = threadIdx%x
        a(i) = a(i)+b
      end subroutine inc
    end module simpleOps_m
The expected outcome is “Success” printed to the console, but that did not happen. Nothing was printed at all: no errors and no messages.
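Since the program only prints when the check passes, a silent failure looks the same as no output. Below is a minimal sketch of a variant that reports what went wrong. It is not part of the original example. It assumes the `cudafor` runtime functions `cudaGetLastError`, `cudaDeviceSynchronize`, and `cudaGetErrorString`; the program name `incCheck`, the variable `istat`, and the `else` branch are added here for illustration only.

```fortran
module simpleOps_m
contains
  attributes(global) subroutine inc(a, b)
    implicit none
    integer :: a(:)
    integer, value :: b
    integer :: i

    i = threadIdx%x
    a(i) = a(i) + b
  end subroutine inc
end module simpleOps_m

program incCheck                      ! hypothetical name, not from the original example
  use cudafor
  use simpleOps_m
  implicit none
  integer, parameter :: n = 256
  integer :: a(n), b, istat           ! istat is an added status variable
  integer, device :: a_d(n)

  a = 1
  b = 3
  a_d = a
  call inc<<<1,n>>>(a_d, b)

  ! Check whether the kernel launch itself was rejected
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) then
    write(*,*) 'Launch error: ', trim(cudaGetErrorString(istat))
  endif

  ! Wait for the kernel and check for errors raised during execution
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) then
    write(*,*) 'Execution error: ', trim(cudaGetErrorString(istat))
  endif

  a = a_d
  if (all(a == 4)) then
    write(*,*) 'Success'
  else
    ! Added branch: the original prints nothing when the check fails
    write(*,*) 'Failure: a(1) = ', a(1)
  endif
end program incCheck
```

If the launch or the kernel fails, this should print the runtime's error string instead of exiting silently, which would help narrow down whether the problem is in the code or in the driver/toolkit setup.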
OS: Linux - Ubuntu 16
Compiler: PGI
Commands to compile and run:
    pgf90 -Mcuda -c Device.cuf
    pgf90 -Mcuda -c Host.cuf
    pgf90 -Mcuda -o HostDevice Device.o Host.o
    ./HostDevice
I tried other CUDA Fortran examples and they did not work either.
Plain Fortran (.f90) code compiled with the same commands works.
How can I fix this problem?