Undefined reference to `__pgi_dev_cublassasum'

I am working my way through the examples in the May 2022 edition of the CUDA Fortran Programming Guide. When I try to compile example 2 on page 14 of the guide, I get the following error message.

ian@ian-Precision-5820-Tower-X-Series:~/document/fortran/nvidia$ nvfortran c0202.cuf
/usr/bin/ld: /tmp/nvfortranJldcP0kBLiQA.o: in function `MAIN_':
/home/ian/document/fortran/nvidia/c0202.cuf:38: undefined reference to `__pgi_dev_cublassasum'
pgacclnk: child process exit status 1: /usr/bin/ld

Here is a reformatted version of the source.
module ramp

real, constant :: twopi


attributes(global) subroutine buildramp(x, n)

  real, device :: x(n)
  integer, value :: n
  real, shared :: term
  if (threadidx%x == 1) term = &
  twopi / float(n)
  call syncthreads()
  i = (blockidx%x-1)*blockdim%x &
  + threadidx%x
  if (i <= n) then
    x(i) = cos(float(i-1)*term)
  end if
end subroutine

end module

program testramp

use cublas
use ramp

integer, parameter :: N = 20000
real, device :: x(N)
twopi = atan(1.0)*8
call buildramp<<<(N-1)/512+1,512>>>(x,N)
!$cuf kernel do
do i = 1, N
x(i) = 2.0 * x(i) * x(i)
end do
print *,"float(N) = ",sasum(N,x,1)

end program

Any thoughts?

Here is the output from another example detailing information about the setup.

Device Number: 0
Device Name: Quadro RTX 4000
Total Global Memory: 8.347 Gbytes
sharedMemPerBlock: 49152 bytes
regsPerBlock: 65536
warpSize: 32
maxThreadsPerBlock: 1024
maxThreadsDim: 1024 x 1024 x 64
maxGridSize: 2147483647 x 65535 x 65535
ClockRate: 1.545 GHz
Total Const Memory: 65536 bytes
Compute Capability Revision: 7.5
TextureAlignment: 512 bytes
deviceOverlap: 1
multiProcessorCount: 36
integrated: 0
canMapHostMemory: 1
ECCEnabled: 0
UnifiedAddressing: 1
L2 Cache Size: 4194304
maxThreadsPerSMP: 1024

For cuBLAS, you need to add the flag "-cudalib=cublas" so that the cuBLAS libraries are linked in.

% nvfortran -fast c0202.cuf -V22.5
/usr/bin/ld: /tmp/nvfortranQiBKj_IA0B31o.o: in function `MAIN_':
/local/home/mcolgrove/c0202.cuf:37: undefined reference to `__pgi_dev_cublassasum'
pgacclnk: child process exit status 1: /usr/bin/ld
% nvfortran -fast -cudalib=cublas c0202.cuf -V22.5
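
If you compile and link in separate steps, the flag is needed on the link line, since that is where the undefined reference is reported. The following is a minimal sketch rather than a tested recipe: the file names come from this thread, the object and executable names are my own choice, and I am assuming that -cuda is needed at link time because the linker only sees a .o file and no longer has the .cuf extension to go by. The check at the top is only there so the script does nothing on a machine without the compiler or the source file.

```shell
# Sketch: separate compile and link steps for the example in this thread.
SRC=c0202.cuf   # source file name from the thread
OBJ=c0202.o     # object file name (my choice)
EXE=c0202       # executable name (my choice)

if command -v nvfortran >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Compile only; the .cuf extension enables CUDA Fortran here.
    nvfortran -fast -c "$SRC" -o "$OBJ"
    # Link; -cudalib=cublas pulls in the cuBLAS libraries.
    # -cuda is assumed to be needed because the input is a plain object file.
    nvfortran -cuda -cudalib=cublas "$OBJ" -o "$EXE"
    ./"$EXE"
else
    echo "nvfortran or $SRC not found; skipping build"
fi
```

Passing the flag only on the compile step would presumably leave the link step failing with the same undefined reference.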

Thanks very much. I couldn’t find that in the documentation I’ve read so far. It worked.