I watched a webinar on OpenACC in which code for a saxpy was discussed. However, the contents of the makefile and the other files it uses were not shown. How might one obtain this code? I believe the saxpy file was called ahost.f90 or ahost.c, and if I recall correctly there was also an adriver.f90 or adriver.c. I don't have my notes at hand, but the names were similar to those. Any help would be greatly appreciated. Thank you.
-Gayle
Hi Gayle,
I sent a note to Michael asking for the example code. Once I have it, I’ll put a package together and post it on pgroup.com for you.
- Mat
Hi Mat,
Thank you so much. I appreciate it.
-Gayle
By the way, where will you put the source code and makefile on the pgroup.com website?
Thanks, Mat.
-Gayle
Hi Gayle,
They can be found at:
http://www.pgroup.com/lit/samples/cexamples.tar
http://www.pgroup.com/lit/samples/fexamples.tar
Have fun!
Mat
Thanks Mat.
-Gayle
Forgive me, but I am new to some of the terminology. I was hoping to get the saxpy code from the webinar, but when I look at the makefile, I do not see anything called saxpy. Is “smoothing” the same thing as saxpy?
Thanks.
-Gayle
Hi Gayle,
These are the examples he used in his most recent OpenACC webinar/training, which he did for ISC. I asked him about Saxpy, but he doesn’t think he had a standalone example for it, or if he did, he’s not sure where it is now.
So I went ahead and wrote up a very basic Saxpy example using OpenACC. Hopefully it’s close to what he showed. Let me know if you need more info.
- Mat
% cat saxpy.f90
subroutine saxpy (A,X,Y,N)
  real(4) :: A, X(N), Y(N)
  integer :: N, i
!$acc kernels
  do i = 1,N
     X(i) = A * X(i) + Y(i)
  enddo
!$acc end kernels
end subroutine

program test
  real, allocatable, dimension(:) :: X, Y
  integer :: N
  real :: A, X1
  N=1024
  A=1.012
  allocate(X(N), Y(N))
  call random_seed()
  call random_number(X)
  call random_number(Y)
  print *, A, X(1), Y(1)
  X1=X(1)
  call saxpy(A,X,Y,N)
  print *, X(1), A*X1+Y(1)
  deallocate(X,Y)
end program test
% pgf90 saxpy.f90 -acc -Minfo -V12.6
saxpy:
      4, Generating copyin(y(:n))
         Generating copy(x(:n))
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
      5, Loop is parallelizable
         Accelerator kernel generated
          5, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
             CC 1.0 : 8 registers; 48 shared, 0 constant, 0 local memory bytes
             CC 2.0 : 10 registers; 0 shared, 64 constant, 0 local memory bytes
% a.out
1.012000 0.6050169 0.7534078
1.365685 1.365685
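
Since the original question also mentioned a C file, here is a rough sketch of what the same loop might look like in C with OpenACC. This is not the webinar code: the file layout, the function signature and the explicit data clauses are assumptions made for illustration. Like the Fortran routine above, it stores the result in x (the BLAS saxpy stores it in y instead).

```c
/* Hypothetical C counterpart of the Fortran saxpy above: x = a*x + y.
   The signature is an assumption for illustration, not the webinar code. */
void saxpy(float a, float *restrict x, const float *restrict y, int n)
{
    /* In C the compiler cannot see array extents through bare pointers,
       so the data clauses spell them out as [start:length].
       x is copied to the device and back; y is only copied in. */
    #pragma acc kernels copy(x[0:n]) copyin(y[0:n])
    for (int i = 0; i < n; ++i) {
        x[i] = a * x[i] + y[i];
    }
}
```

The function needs a small driver to call it, much like the Fortran test program. It should build with the same -acc and -Minfo flags under pgcc; a compiler without OpenACC support simply ignores the pragma and runs the loop on the host.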