Dear Nvidia users,
I’m using OpenMP offload with NVHPC 21.9. I would like to call cuBLAS routines and a routine containing a TARGET region from inside a TARGET DATA region, using use_device_ptr and is_device_ptr:
!$OMP TARGET DATA MAP(TOFROM:xbar,bbar,b,alpha) MAP(TO:xx,bb,w)
!$OMP& use_device_ptr(xbar,xx,bb,bbar,b,w)
      do k = 2,m
         alpha_d = alpha(k)
         call cublasDaxpy(n, alpha_d, xx(:,k), 1, xbar, 1)
         call cublasDaxpy(n, alpha_d, bb(1,k), 1, bbar, 1)
         call cublasDaxpy(n, -alpha_d, bb(1,k), 1, b, 1)
      enddo
      do k = 1, m
         alpha(k) = vlsc3_omp(xx(1,k),w,b,n)
      enddo
!$OMP END TARGET DATA
Where vlsc3_omp is:
      dimension x(n),y(n),b(n)
      real dt
      dt = 0.0
!$OMP TARGET TEAMS LOOP REDUCTION(+:dt) is_device_ptr(x,y,b)
!$OMP& MAP(TOFROM:dt)
      do i=1,n
         dt = dt+x(i)*y(i)*b(i)
      enddo
      vlsc3_omp = dt
      return
      end
The code crashes inside the above subroutine, and I don’t understand why. All the arrays should be on the GPU, since they are listed in use_device_ptr, and the loop in vlsc3_omp runs on the GPU. What am I doing wrong? Thanks