Dear Friends,
I am working through an example from an article by Jeff Larkin on techniques for combining OpenACC and CUDA. I took the Fortran example called openacc_cublas as a starting point and replaced the library call with a simple call to my own subroutine, “barak”. Here is the code:
!-----------------------------------------------------------------------
program main
  use cublas
  integer, parameter :: N = 2**20
  real, dimension(N) :: X, Y

  !$acc data create(x,y)
  !$acc kernels
  X(:) = 1.0
  Y(:) = 0.0
  !$acc end kernels

  !$acc host_data use_device(x,y)
  call barak(N, 2.0, x, 1, y, 1)
  !$acc end host_data
  !$acc update self(y)
  !$acc end data

  print *, y(1)
end program

subroutine barak(n,c,x,y)
  real, dimension(N) :: X, Y
  y=c*x
  return
end
!------------------------------------------------------------------------------
I compile the code the same way as the original example:
export CUDA_HOME=/usr/local/cuda-7.0/
pgfortran openacc_barak.o -L$CUDA_HOME/lib64 -lcudart -Mcuda -fast -acc -ta=nvidia -Minfo=accel
But when I run it, I get:
Segmentation fault (core dumped)
Could anyone explain what is wrong and how to fix the code?
Thanks in advance,
Barak
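
Two things in the listing above look like plausible causes of the crash. First, the call passes six arguments, `barak(N, 2.0, x, 1, y, 1)`, while the subroutine is declared with four, `barak(n,c,x,y)`. Because barak is an external routine with an implicit interface, the compiler cannot flag the mismatch, and the dummy argument y ends up bound to the literal 1. Second, `host_data use_device(x,y)` hands device addresses to the callee. That suits a CUDA library routine, but barak performs `y=c*x` as ordinary host code. Below is a minimal sketch of one possible fix, not verified against PGI with CUDA 7.0. It assumes the goal is simply to run `y = c*x` on the GPU through OpenACC. The module name `barak_mod` is hypothetical and only serves to give the call an explicit interface. The `host_data` region and `use cublas` are dropped because no CUDA library routine is called.

!-----------------------------------------------------------------------
```fortran
module barak_mod   ! hypothetical module name, used to get an explicit interface
  implicit none
contains
  subroutine barak(n, c, x, y)
    integer, intent(in)  :: n
    real,    intent(in)  :: c
    real,    intent(in)  :: x(n)
    real,    intent(out) :: y(n)
    ! x and y are expected to be on the device already,
    ! placed there by the caller's data region.
    !$acc kernels present(x, y)
    y = c * x
    !$acc end kernels
  end subroutine barak
end module barak_mod

program main
  use barak_mod
  implicit none
  integer, parameter :: N = 2**20
  real, dimension(N) :: X, Y

  !$acc data create(x, y)
  !$acc kernels
  X(:) = 1.0
  Y(:) = 0.0
  !$acc end kernels

  ! Four arguments, matching the subroutine's declaration.
  ! No host_data: barak is OpenACC code and finds x and y through present().
  call barak(N, 2.0, x, y)

  !$acc update self(y)
  !$acc end data

  print *, y(1)   ! every element of y should equal 2.0
end program main
```
!-----------------------------------------------------------------------

With the module in place, a call with the wrong number of arguments should be rejected at compile time instead of failing at run time. If the `host_data` region has to stay, for example because the routine will later wrap a CUDA library call, the callee would need to treat x and y as device pointers, which this sketch does not attempt.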