Unable to compile the PGI provided example for sgemm CUDA FORTRAN

Hello all,
I am new to CUDA Fortran and have been exploring the CUDA examples that PGI provides in the directory
C:\Program Files\PGI\win64\2019\examples\CUDA-Fortran\CUDA-Fortran-Book\appendixC\sgemmDevice
When I compile the example with the following command in PGI 19.4 Community Edition

pgf90 -o sgemmDevice.exe sgemmDevice.cuf -lcublas

I get this error:

LINK : fatal error LNK1104: cannot open file ‘sgemmDevice.exe’
./sgemmDevice.exf: error STP001: cannot open file

I am using a Windows 10 x64 laptop with an Intel i7 processor and an NVIDIA GTX 1050 Ti GPU.
I have also attached the code below.

!     Copyright (c) 2017, NVIDIA CORPORATION.  All rights reserved.
! NVIDIA CORPORATION and its licensors retain all intellectual property
! and proprietary rights in and to this software, related documentation
! and any modifications thereto.
!    These example codes are a portion of the code samples from the companion
!    website to the book "CUDA Fortran for Scientists and Engineers":
! http://store.elsevier.com/product.jsp?isbn=9780124169708

program sgemmDevice
  use cublas
  use cudafor
  implicit none
  integer, parameter :: m = 100, n = 100, k = 100
  real :: a(m,k), b(k,n), c(m,n)
  real, device :: a_d(m,k), b_d(k,n), c_d(m,n)
  real, parameter :: alpha = 1.0, beta = 0.0
  integer :: lda = m, ldb = k, ldc = m
  integer :: istat

  a = 1.0; b = 2.0; c = 0.0
  a_d = a; b_d = b; c_d = c

  istat = cublasInit()
  call cublasSgemm('n','n',m,n,k, &
       alpha,a_d,lda,b_d,ldb,beta,c_d,ldc)

  c = c_d
  write(*,*) 'Max error =', maxval(c-k*2.0)
  if (maxval(c-k*2.0) .gt. 0.00000001) then
     write(*,*) "Test Failed"
  else
     write(*,*) "Test Passed"
  endif

end program sgemmDevice

Can anyone please take a look at it?


It looks like you don't have write permission in the directory where you're compiling, so the linker cannot create sgemmDevice.exe. Can you try compiling in a directory where you know you have write permission?

Also, you should try using the flag -Mcudalib=cublas instead of -lcublas, so that the compiler links the cuBLAS library that matches the version of CUDA it is using.
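
As a rough sketch of both suggestions together, assuming a plain Windows command prompt with the PGI compilers on the PATH (the folder name cuf-test is only a placeholder, and any directory you can write to would do), you could copy the example under your user profile and build it there:

```
rem Create a working folder in a location the current user can write to
rem (cuf-test is a hypothetical name chosen for illustration)
mkdir "%USERPROFILE%\cuf-test"

rem Copy the example source out of the write-protected install tree
copy "C:\Program Files\PGI\win64\2019\examples\CUDA-Fortran\CUDA-Fortran-Book\appendixC\sgemmDevice\sgemmDevice.cuf" "%USERPROFILE%\cuf-test"

rem Build in the writable folder, letting the compiler select the matching cuBLAS
cd /d "%USERPROFILE%\cuf-test"
pgf90 -o sgemmDevice.exe sgemmDevice.cuf -Mcudalib=cublas
```

If you work in a different shell, the copy and directory commands will differ, but the pgf90 line should stay the same.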

Hope this helps,

Hi Mat,
Thanks a ton. It worked after I made the changes. Thanks again.

Sidarth N