Would anyone have an example of using PGI accelerator directives in Fortran 90 in conjunction with a cublas dgemv call?
From searching the forum, I believe this is possible, but I was not able to find any examples.
I’d imagine that it would be possible to place the array data on the device using the data region directives, call cublas dgemv, and then do a copy out.
The PGI Accelerator Model will recognise CUDA Fortran device variables. So here, you would call CUBLAS dgemv using CUDA Fortran and then just use the device array in the compute region. No need to use a data region.
Note that the CUDA Fortran SDK has an example of calling sgemm (/opt/pgi/linux86-64/2012/cuda/CUDA-Fortran-SDK/cublasTestSgemm.F90).
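
A rough sketch of what that could look like is below. It is untested and rests on a few assumptions: that your PGI release ships the `cublas` interface module (if it does not, you would declare the interface to dgemv yourself, along the lines of the SDK sgemm example), that the legacy-style `cublasDgemv` call with a character transpose argument is the one exposed, and that the array sizes, names and the scaling loop are just placeholders for illustration.

```fortran
program dgemv_acc
  use cudafor
  use cublas   ! assumption: PGI's CUDA Fortran interface module for CUBLAS is available
  implicit none
  integer, parameter :: m = 512, n = 256   ! illustrative sizes
  real(8) :: a(m,n), x(n), y(m)            ! host arrays
  real(8), device :: a_d(m,n), x_d(n), y_d(m)  ! CUDA Fortran device arrays
  integer :: i

  a = 1.0d0
  x = 2.0d0

  ! Host-to-device copies via CUDA Fortran assignment
  a_d = a
  x_d = x
  y_d = 0.0d0

  ! y_d = 1.0 * a_d * x_d + 0.0 * y_d, computed on the device by CUBLAS
  call cublasDgemv('N', m, n, 1.0d0, a_d, m, x_d, 1, 0.0d0, y_d, 1)

  ! Accelerator compute region that uses the device array directly,
  ! so no data region or copy clauses are needed for y_d
  !$acc region
  do i = 1, m
     y_d(i) = 2.0d0 * y_d(i)
  end do
  !$acc end region

  ! Device-to-host copy of the result
  y = y_d
  print *, y(1), y(m)
end program dgemv_acc
```

You would probably build this with something like `pgfortran -Mcuda -ta=nvidia dgemv_acc.cuf -lcublas`, though the exact flags may differ between compiler versions.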